All language subtitles for Harvard-CS50-Full-Computer-Science-University-Course_en

1 00:00:00,000 --> 00:00:04,000 If you want to learn about computer science and the art\n 2 00:00:04,000 --> 00:00:09,300 CS50 is considered by many to be one of the\n 3 00:00:09,300 --> 00:00:13,000 This is a Harvard University course\ntaught by Dr. David Malan 4 00:00:13,000 --> 00:00:17,000 and we are proud to bring it to\n 5 00:00:17,000 --> 00:00:23,000 Throughout a series of lectures, Dr. Malan will teach you \n 6 00:00:23,000 --> 00:00:27,300 And make sure to check the description for a lot of\n 7 00:01:45,801 --> 00:01:50,281 DAVID MALAN: All right, this is CS50,\n 8 00:01:50,281 --> 00:01:52,591 to the intellectual\nenterprises of computer science 9 00:01:52,590 --> 00:01:56,340 and the art of programming, back here\n 10 00:01:56,340 --> 00:01:58,410 for the first time in quite a while. 11 00:02:13,311 --> 00:02:16,911 And I took this class myself\nsome time ago, but almost didn't. 12 00:02:16,911 --> 00:02:20,121 It was sophomore fall and I\nwas sitting in on the class. 13 00:02:20,121 --> 00:02:22,431 And I was a little curious\nbut, eh, it didn't really 14 00:02:24,508 --> 00:02:26,841 I was definitely a computer\nperson, but computer science 15 00:02:26,841 --> 00:02:28,314 felt like something altogether. 16 00:02:28,313 --> 00:02:30,230 And I only got up the\nnerve to take the class 17 00:02:30,230 --> 00:02:32,870 ultimately, because the professor\nat the time, Brian Kernighan 18 00:02:32,871 --> 00:02:35,600 allowed me to take the\nclass pass/fail, initially. 19 00:02:35,600 --> 00:02:37,490 And that is what made\nall the difference. 20 00:02:37,491 --> 00:02:39,981 I quickly found that\ncomputer science is not just 21 00:02:39,980 --> 00:02:42,800 about programming and working\nin isolation on your computer. 22 00:02:42,800 --> 00:02:45,390 It's really about problem\nsolving more generally. 
23 00:02:45,390 --> 00:02:48,080 And there was something\nabout homework, frankly 24 00:02:48,080 --> 00:02:51,470 that was, like, actually fun for perhaps\n 25 00:02:51,471 --> 00:02:53,996 And there was something\nabout this ability 26 00:02:53,996 --> 00:02:56,121 that I discovered, along\nwith all of my classmates 27 00:02:56,121 --> 00:03:00,373 to actually create something and bring\n 28 00:03:00,372 --> 00:03:03,080 and sort of bring to bear something\n 29 00:03:03,080 --> 00:03:06,260 but didn't really know how to harness,\n 30 00:03:06,260 --> 00:03:08,150 and definitely challenging\nand frustrating. 31 00:03:08,151 --> 00:03:10,753 Like, to this day,\nall these years later 32 00:03:10,752 --> 00:03:13,460 you're going to run up against\n 33 00:03:13,461 --> 00:03:15,111 in programming, that\njust drive you nuts. 34 00:03:15,110 --> 00:03:16,610 And you feel like you've hit a wall. 35 00:03:16,610 --> 00:03:18,950 But the trick really is\nto give it enough time 36 00:03:18,950 --> 00:03:21,180 to take a step back, take\na break when you need to. 37 00:03:21,181 --> 00:03:24,441 And there's nothing better, I daresay,\n 38 00:03:24,441 --> 00:03:26,169 and pride, really,\nwhen you get something 39 00:03:26,169 --> 00:03:28,461 to work, and in a class like\nthis, present, ultimately 40 00:03:28,461 --> 00:03:32,091 at term's end, something like\nyour very own final project. 41 00:03:32,091 --> 00:03:35,551 Now, this isn't to say that\nI took to it 100% perfectly. 42 00:03:35,550 --> 00:03:40,760 In fact, just this past week, I looked\n 43 00:03:40,760 --> 00:03:43,281 have from some 25 years\nago, and took a photo 44 00:03:43,281 --> 00:03:47,961 of what was apparently the very first\n 45 00:03:47,961 --> 00:03:50,271 and quickly received minus 2 points on. 
46 00:03:50,270 --> 00:03:53,450 But this is a program that we'll\n 47 00:03:53,450 --> 00:03:57,740 does something quite simply like\n 48 00:03:58,408 --> 00:04:00,200 And to be fair, I\ntechnically hadn't really 49 00:04:00,200 --> 00:04:02,480 followed the directions, which is\n 50 00:04:02,480 --> 00:04:05,802 But if you just look at this, especially\n 51 00:04:05,802 --> 00:04:07,760 you might have heard\nabout programming language 52 00:04:07,760 --> 00:04:09,718 but you've never typed\nsomething like this out 53 00:04:09,718 --> 00:04:11,480 undoubtedly it's going to look cryptic. 54 00:04:11,480 --> 00:04:13,520 But unlike human\nlanguages, frankly, which 55 00:04:13,520 --> 00:04:17,480 were a lot more sophisticated, a\nlot more vocabulary, a lot more 56 00:04:17,480 --> 00:04:21,620 grammatical rules, programming, once\n 57 00:04:21,620 --> 00:04:24,733 it is and how it works and what these\n 58 00:04:24,733 --> 00:04:26,900 you'll see, after a few\nmonths of a class like this 59 00:04:26,901 --> 00:04:29,001 to start teaching\nyourself, subsequently 60 00:04:29,000 --> 00:04:32,730 other languages, as they may\ncome, in the coming years as well. 61 00:04:32,730 --> 00:04:36,050 So what ultimately matters\nin this particular course 62 00:04:36,050 --> 00:04:38,690 is not so much where you end\nup relative to your classmates 63 00:04:38,690 --> 00:04:41,900 but where you end up relative\nto yourself when you began. 64 00:04:41,901 --> 00:04:43,381 And indeed, you'll begin today. 65 00:04:43,380 --> 00:04:46,910 And the only experience that matters\n 66 00:04:46,911 --> 00:04:49,040 And so, consider where you are today. 67 00:04:49,040 --> 00:04:51,531 Consider, perhaps, just how\ncryptic something like that 68 00:04:52,850 --> 00:04:56,180 And take comfort in knowing just\n 69 00:04:56,180 --> 00:04:58,408 will be within your own grasp. 
70 00:04:58,408 --> 00:05:01,700 And if you're thinking that, OK, surely\n 71 00:05:01,701 --> 00:05:05,421 to the right, behind me, knows more than\n 72 00:05:05,420 --> 00:05:10,100 2/3 of CS50 students have never taken\n 73 00:05:10,100 --> 00:05:14,730 you're in very good company\nthroughout this whole term. 74 00:05:14,730 --> 00:05:16,820 So then, what is computer science? 75 00:05:16,821 --> 00:05:18,441 I claim that it's problem solving. 76 00:05:18,440 --> 00:05:20,720 And the upside of that is\nthat problem solving is 77 00:05:20,721 --> 00:05:23,031 something we sort of do all the time. 78 00:05:23,031 --> 00:05:25,834 But a computer science\nclass, learning to program 79 00:05:25,834 --> 00:05:27,500 I think kind of cleans up your thoughts. 80 00:05:27,500 --> 00:05:31,040 It helps you learn how to think more\n 81 00:05:32,630 --> 00:05:34,463 Because, honestly, the\ncomputer is not going 82 00:05:34,463 --> 00:05:37,670 to do what you want unless you are\n 83 00:05:37,670 --> 00:05:39,896 And so, as such, there's\nthese fringe benefits 84 00:05:39,896 --> 00:05:42,771 of just learning to think like a\n 85 00:05:42,771 --> 00:05:45,561 And it doesn't take all\nthat much to start doing so. 86 00:05:45,560 --> 00:05:49,221 This, for instance, is perhaps the\n 87 00:05:49,221 --> 00:05:51,141 sure, but really problem\nsolving in general. 88 00:05:51,141 --> 00:05:54,471 Problems are all about taking input,\n 89 00:05:54,471 --> 00:05:56,360 You want to get the solution, a.k.a. 90 00:05:57,050 --> 00:05:59,750 And so, something interesting\nhas got to be happening in here 91 00:05:59,750 --> 00:06:03,240 in here, when you're trying to\nget from those inputs to outputs. 
92 00:06:03,240 --> 00:06:05,810 Now, in the world of\ncomputers specifically 93 00:06:05,810 --> 00:06:09,680 we need to decide in advance how we\n 94 00:06:09,680 --> 00:06:13,723 We all just need to decide, whether\n 95 00:06:13,723 --> 00:06:16,640 else, that we're all going to speak\n 96 00:06:16,641 --> 00:06:18,841 of our human languages as well. 97 00:06:18,841 --> 00:06:22,550 And you may very well know that\ncomputers tend to speak only 98 00:06:26,740 --> 00:06:29,230 Assembly, one, but binary,\ntwo, might be your go-to. 99 00:06:29,230 --> 00:06:32,320 And binary, by implying two,\nmeans that the world of computers 100 00:06:32,321 --> 00:06:35,981 has just two digits at\nits disposal, 0 and 1. 101 00:06:35,980 --> 00:06:40,060 And indeed, we humans have many more\n 102 00:06:40,690 --> 00:06:43,120 But a computer indeed\nonly has zeros and ones. 103 00:06:43,120 --> 00:06:45,341 And yet, somehow they can do so much. 104 00:06:45,341 --> 00:06:47,711 They can crunch numbers in\nExcel, send text messages 105 00:06:47,711 --> 00:06:51,081 create images and artwork\nand movies and more. 106 00:06:51,081 --> 00:06:54,790 And so, how do you get from something\n 107 00:06:54,790 --> 00:06:56,920 to all of the stuff\nthat we're doing today 108 00:06:56,920 --> 00:06:58,870 in our pockets and laptops and desktops? 109 00:06:58,870 --> 00:07:01,570 Well, it turns out that\nwe can start quite simply. 110 00:07:01,571 --> 00:07:05,471 If a computer were to want to do\n 111 00:07:06,190 --> 00:07:09,190 Well, in our human world,\nwe might count doing this 112 00:07:09,190 --> 00:07:13,778 like 1, 2, 3, 4, 5, using so-called\n 113 00:07:13,778 --> 00:07:16,570 on your fingers where one finger\n 114 00:07:16,571 --> 00:07:18,521 if I'm, for instance, taking attendance. 115 00:07:18,521 --> 00:07:22,600 Now, we humans would typically\nactually count 1, 2, 3, 4, 5, 6. 
116 00:07:22,600 --> 00:07:25,480 And we'd go past just those five\ndigits and count much higher 117 00:07:26,951 --> 00:07:29,871 But computers, somehow, only\nhave these zeros and ones. 118 00:07:29,870 --> 00:07:33,190 So if a computer only somehow\nspeaks binary, zeros and ones 119 00:07:33,190 --> 00:07:36,180 how does it even count\npast the number 1? 120 00:07:36,180 --> 00:07:38,740 Well, here are 3 zeros, of course. 121 00:07:38,740 --> 00:07:42,250 And if you translate this\nnumber in binary, 000 122 00:07:42,250 --> 00:07:46,150 to a more familiar number in decimal,\n 123 00:07:47,021 --> 00:07:49,871 If we were to represent, with\na computer, the number 1 124 00:07:49,870 --> 00:07:52,930 it would actually be 001,\nwhich, not surprisingly 125 00:07:52,930 --> 00:07:55,990 is exactly the same as we\nmight do in our human world 126 00:07:55,990 --> 00:07:59,500 but we might not bother writing\n 127 00:07:59,500 --> 00:08:02,170 But a computer, now, if it\nwants to count as high as two 128 00:08:03,701 --> 00:08:06,431 And so it has to use a different\npattern of zeros and ones. 129 00:08:08,620 --> 00:08:10,810 So this is not 10 with\na zero in front of it. 130 00:08:10,810 --> 00:08:13,281 It's indeed zero one zero\nin the context of binary. 131 00:08:13,281 --> 00:08:15,401 And if we want to count\nhigher now than two 132 00:08:15,401 --> 00:08:19,540 we're going to have to tweak these\n 133 00:08:19,540 --> 00:08:24,011 And then if we want 4\nor 5 or 6 or 7, we're 134 00:08:24,011 --> 00:08:26,921 just kind of toggling these\nzeros and ones, a.k.a. 135 00:08:26,920 --> 00:08:31,408 bits, for binary digits that represent,\n 136 00:08:31,408 --> 00:08:33,490 different numbers that you\nand I, as humans, know 137 00:08:33,490 --> 00:08:36,730 of course, as the so-called\ndecimal system, 0 through 9 138 00:08:36,730 --> 00:08:40,390 dec implying 10, 10 digits,\nthose zeros through nine. 139 00:08:40,390 --> 00:08:42,760 So why that particular pattern? 
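As a small illustration of the counting just described, here is a minimal Python sketch that lists every pattern three bits can hold, from 000 up to 111. The function name is made up for this example and is not part of the lecture.

```python
def three_bit_patterns():
    """Return the binary patterns for 0 through 7, padded to three digits."""
    # format(n, "03b") writes n in base 2, zero-padded to a width of three
    return [format(n, "03b") for n in range(8)]


# Print each decimal number next to its three-bit pattern
for decimal, bits in enumerate(three_bit_patterns()):
    print(decimal, bits)
```

Each row pairs a decimal number with the on/off pattern of three switches, so 2 appears as 010 and 7 as 111.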
140 00:08:42,760 --> 00:08:44,680 And why these particular zeros and ones? 141 00:08:44,681 --> 00:08:48,011 Well, it turns out that\nrepresenting one thing or the other 142 00:08:48,010 --> 00:08:50,360 is just really simple for a computer. 143 00:08:50,860 --> 00:08:53,110 At the end of the day, they're\npowered by electricity. 144 00:08:53,110 --> 00:08:56,081 And it's a really simple thing to\n 145 00:08:56,081 --> 00:08:57,791 or don't store some electricity. 146 00:08:57,791 --> 00:09:00,911 Like, that's as simple as\nthe world can get, on or off. 147 00:09:03,110 --> 00:09:05,821 So, in fact, inside of a\ncomputer, a phone, anything 148 00:09:05,821 --> 00:09:07,571 these days that's\nelectronic, pretty much 149 00:09:07,571 --> 00:09:10,763 is some number of switches,\notherwise known as transistors. 150 00:09:11,471 --> 00:09:14,852 You've got thousands, millions of them\n 151 00:09:14,852 --> 00:09:17,810 And these are just tiny little switches\n 152 00:09:17,811 --> 00:09:20,471 And by turning those things\non and off in patterns 153 00:09:20,471 --> 00:09:24,274 a computer can count from 0 on up\n 154 00:09:24,274 --> 00:09:27,191 And so these switches, really, you\n 155 00:09:27,691 --> 00:09:29,983 Let me just borrow one of\nour little stage lights here. 156 00:09:32,260 --> 00:09:34,690 And so, I could just think\nof this as representing 157 00:09:34,691 --> 00:09:38,110 in my laptop, a transistor,\na switch, representing 0. 158 00:09:38,110 --> 00:09:43,120 But if I allow some electricity\n 159 00:09:43,120 --> 00:09:44,730 Well, how do I count higher than 1? 160 00:09:44,730 --> 00:09:46,341 I, of course, need another light bulb. 161 00:09:46,341 --> 00:09:48,531 So let me grab another one here. 162 00:09:48,530 --> 00:09:53,350 And if I put it in that same kind of\n 163 00:09:53,350 --> 00:09:57,140 That's sort of the old finger\ncounting way of unary, just 1, 2. 
164 00:09:57,140 --> 00:09:59,140 I want to actually take\ninto account the pattern 165 00:09:59,140 --> 00:10:00,680 of these things being on and off. 166 00:10:00,681 --> 00:10:06,730 So if this was one a moment ago, what I\n 167 00:10:06,730 --> 00:10:10,660 and let the next one over be on, a.k.a. 168 00:10:12,071 --> 00:10:15,110 And let me get us a\nthird bit, if you will. 169 00:10:16,600 --> 00:10:20,440 Here is that same pattern now,\nstarting at the beginning with 3. 170 00:10:25,721 --> 00:10:32,770 Here is 010, a.k.a., in our\nhuman world of decimal, 2. 171 00:10:32,770 --> 00:10:35,170 And then we could, of course,\nkeep counting further. 172 00:10:35,171 --> 00:10:37,941 This now would be 3 and dot dot dot. 173 00:10:37,941 --> 00:10:40,870 If this other bulb now goes\non, and that switch is turned 174 00:10:40,870 --> 00:10:43,360 and all three stay on--\nthis, again, was what number? 175 00:10:45,581 --> 00:10:49,811 So it's just as simple,\nrelatively, as that, if you will. 176 00:10:49,811 --> 00:10:53,980 But how is it that these\npatterns came to be? 177 00:10:53,980 --> 00:10:56,806 Well, these patterns actually\nfollow something very familiar. 178 00:10:56,806 --> 00:10:58,931 You and I don't really\nthink about it at this level 179 00:10:58,931 --> 00:11:02,801 anymore because we've probably been\n 180 00:11:04,270 --> 00:11:09,160 But if we consider something in\ndecimal, like the number 123 181 00:11:10,301 --> 00:11:12,971 This looks like 123 in decimal. 182 00:11:13,660 --> 00:11:17,800 It's really just three symbols,\n 183 00:11:17,801 --> 00:11:20,291 with a couple of curves, that\nyou and I now instinctively 184 00:11:22,150 --> 00:11:27,000 But if we do rewind a few years,\n 185 00:11:27,000 --> 00:11:30,480 because you're assigning meaning\nto each of these columns. 186 00:11:30,480 --> 00:11:33,331 The 3 is in the so-called ones place. 187 00:11:33,331 --> 00:11:36,931 The 2 is in the so-called tens place. 
188 00:11:36,931 --> 00:11:39,841 And the 1 is in the\nso-called hundreds place. 189 00:11:39,841 --> 00:11:41,850 And then the math ensues\nquickly in your head. 190 00:11:41,850 --> 00:11:47,311 This is technically 100 times 1, plus\n 191 00:11:48,990 --> 00:11:54,390 And there we get the sort of\nmathematical notion we know as 123. 192 00:11:54,390 --> 00:11:58,570 Well, nicely enough, in binary,\nit's actually the same thing. 193 00:11:58,571 --> 00:12:01,021 It's just these columns mean\na little something different. 194 00:12:01,020 --> 00:12:05,010 If you use three digits in decimal,\nand you have the ones place 195 00:12:05,010 --> 00:12:09,331 the tens place, and the hundreds place,\n 196 00:12:09,331 --> 00:12:11,230 They're technically just powers of 10. 197 00:12:11,230 --> 00:12:13,620 So 10 to the 0, 10 to\nthe 1, 10 to the 2. 198 00:12:14,520 --> 00:12:16,440 Decimal system, "dec" meaning 10. 199 00:12:16,441 --> 00:12:18,841 You have 10 digits, 0 through 9. 200 00:12:18,841 --> 00:12:21,390 In the binary system, if you're\ngoing to use three digits 201 00:12:21,390 --> 00:12:24,910 just change the bases if you're\nusing only zeros and ones. 202 00:12:24,910 --> 00:12:29,130 So now it's powers of 2, 2 to the\n 203 00:12:31,900 --> 00:12:36,120 And if you keep going, it's going\n 204 00:12:37,510 --> 00:12:40,260 So, why did we get these\npatterns that we did? 205 00:12:40,260 --> 00:12:46,290 Here's your 000 because it's 4 times\n 206 00:12:46,291 --> 00:12:49,471 This is why we got the\ndecimal number 1 in binary. 207 00:12:49,471 --> 00:12:53,341 This is why we got the number 2\nin binary, because it's 4 times 208 00:12:53,341 --> 00:13:01,051 0, plus 2 times 1, plus 1 times 0, and\n 209 00:13:01,831 --> 00:13:05,373 And, of course, if you wanted to\n 210 00:13:06,331 --> 00:13:08,831 What does a computer need to\ndo to count even higher than 7? 211 00:13:10,880 --> 00:13:12,530 Add another light bulb, another switch. 
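The place-value rule described here is the same in both systems: multiply each digit by a power of the base and add the results. A minimal Python sketch follows; the helper name from_digits is an assumption made for this example.

```python
def from_digits(digits, base):
    """Sum digit * base**position, where position 0 is the rightmost column."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


print(from_digits("123", 10))  # 100*1 + 10*2 + 1*3
print(from_digits("010", 2))   # 4*0 + 2*1 + 1*0
print(from_digits("111", 2))   # 4*1 + 2*1 + 1*1
print(from_digits("1000", 2))  # a fourth bit adds an eights column
```

The last call shows why counting past 7 needs another switch: the next column to the left is worth 8.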
212 00:13:12,530 --> 00:13:14,810 And, indeed, computers\nhave standardized just how 213 00:13:14,811 --> 00:13:17,240 many zeros and ones,\nor bits or switches 214 00:13:17,240 --> 00:13:19,020 they throw at these kinds of problems. 215 00:13:19,020 --> 00:13:23,300 And, in fact, most computers would\n 216 00:13:23,301 --> 00:13:25,971 And even if you're only counting\nas high as three or seven 217 00:13:25,971 --> 00:13:28,431 you would still use eight and\nhave a whole bunch of zeros. 218 00:13:28,431 --> 00:13:31,551 But that's OK, because the\ncomputers these days certainly 219 00:13:31,551 --> 00:13:35,301 have so many more, thousands,\n 220 00:13:37,921 --> 00:13:41,671 All right, so, with that said, if\n 221 00:13:41,671 --> 00:13:44,161 or, frankly, as high\nas we want, that only 222 00:13:44,160 --> 00:13:46,770 seems to make computers\nuseful for things like Excel 223 00:13:48,030 --> 00:13:50,280 But computers, of course,\nlet you send text messages 224 00:13:50,280 --> 00:13:52,240 write documents, and so much more. 225 00:13:52,240 --> 00:13:55,530 So how would a computer represent\nsomething like a letter 226 00:13:55,530 --> 00:13:59,860 like the letter A of the English\n 227 00:14:03,721 --> 00:14:05,961 AUDIENCE: You can represent\nletters in numbers. 228 00:14:05,961 --> 00:14:08,571 DAVID MALAN: OK, so we could\nrepresent letters using numbers. 229 00:14:10,004 --> 00:14:11,421 What number should represent what? 230 00:14:11,421 --> 00:14:15,221 AUDIENCE: Say if you were starting\n 231 00:14:15,221 --> 00:14:18,745 you could say 1 is A, 2 is B, 3 is C. 232 00:14:19,620 --> 00:14:22,630 Yeah, we just all have to agree\nsomehow that one number is 233 00:14:22,630 --> 00:14:23,880 going to represent one letter. 234 00:14:23,880 --> 00:14:28,490 So 1 is A, 2 is B, 3 is\nC, Z is 26, and so forth. 235 00:14:28,490 --> 00:14:30,990 Maybe we can even take into\naccount uppercase and lowercase. 
236 00:14:30,990 --> 00:14:34,230 We just have to agree and sort of\n 237 00:14:34,230 --> 00:14:36,461 And humans, indeed, did just that. 238 00:14:38,010 --> 00:14:40,110 It turns out they started\na little higher up. 239 00:14:40,110 --> 00:14:44,130 Capital A has been\nstandardized as the number 65. 240 00:14:44,130 --> 00:14:47,581 And capital B has been\nstandardized as the number 66. 241 00:14:47,581 --> 00:14:50,370 And you can kind of imagine\nhow it goes up from there. 242 00:14:50,370 --> 00:14:53,250 And that's because whatever\nyou're representing 243 00:14:53,250 --> 00:14:57,461 ultimately, can only be stored, at\n 244 00:14:57,461 --> 00:15:01,110 And so, some humans in a room before,\n 245 00:15:01,110 --> 00:15:05,100 or, really, this pattern of zeros\n 246 00:15:08,791 --> 00:15:12,481 So if that pattern of zeros and\nones ever appears in a computer 247 00:15:12,480 --> 00:15:17,910 it might be interpreted then as indeed\n 248 00:15:18,811 --> 00:15:23,191 But I worry, just to be clear, we\n 249 00:15:23,191 --> 00:15:25,681 It might seem, if I play\nthis naively, that, OK 250 00:15:25,681 --> 00:15:28,860 how do I now actually do\nmath with the number 65? 251 00:15:28,860 --> 00:15:33,571 If now Excel displays 65 is\nan A, let alone Bs and Cs. 252 00:15:33,571 --> 00:15:37,230 So how might a computer\ndo as you've proposed 253 00:15:37,230 --> 00:15:41,875 have this mapping from numbers to\n 254 00:15:41,875 --> 00:15:43,500 It feels like we've given something up. 255 00:15:44,000 --> 00:15:46,262 AUDIENCE: By having\na prefix for letters? 256 00:15:46,263 --> 00:15:47,596 DAVID MALAN: By having a prefix? 257 00:15:47,596 --> 00:15:48,887 AUDIENCE: You could have\nprefixes and suffixes. 
258 00:15:48,886 --> 00:15:51,640 DAVID MALAN: OK, so we could\nperhaps have some kind of prefix 259 00:15:51,640 --> 00:15:53,230 like some pattern of zeros and ones-- 260 00:15:53,230 --> 00:15:56,201 I like this-- that\nindicates to the computer 261 00:15:56,201 --> 00:15:58,870 here comes another pattern\nthat represents a letter. 262 00:15:58,870 --> 00:16:02,920 Here comes another pattern that\nrepresents a number or a letter. 263 00:16:05,620 --> 00:16:08,480 How might a computer\ndistinguish these two? 264 00:16:08,980 --> 00:16:11,360 AUDIENCE: Have a\ndifferent file format, so 265 00:16:11,360 --> 00:16:16,120 like, odd text or just\ncheck the graphic or-- 266 00:16:16,120 --> 00:16:17,841 DAVID MALAN: Indeed, and that's spot-on. 267 00:16:17,841 --> 00:16:20,781 Nothing wrong with what you suggested,\n 268 00:16:20,780 --> 00:16:23,613 The reason we have all of these\n 269 00:16:23,614 --> 00:16:28,521 like JPEG and GIF and PNGs\nand Word documents, .docx 270 00:16:28,520 --> 00:16:32,270 and Excel files and so forth, is\n 271 00:16:32,270 --> 00:16:35,780 and decided, well, in the context\n 272 00:16:35,780 --> 00:16:38,600 more specifically, in the\ncontext of this type of program 273 00:16:38,600 --> 00:16:42,140 Excel versus Photoshop versus\nGoogle Docs or the like 274 00:16:42,140 --> 00:16:46,550 we shall interpret any patterns of\n 275 00:16:46,551 --> 00:16:51,051 for Excel, maybe letters in, like, a\n 276 00:16:51,051 --> 00:16:54,471 or maybe even colors of the rainbow\n 277 00:16:55,610 --> 00:16:58,490 And we'll see, when we\nourselves start programming 278 00:16:58,490 --> 00:17:00,471 you the programmer\nwill ultimately provide 279 00:17:00,471 --> 00:17:05,671 some hints to the computer that tells\n 280 00:17:05,671 --> 00:17:08,631 So, similar in spirit to that, but\n 281 00:17:09,260 --> 00:17:12,651 So this system here actually has a\n 282 00:17:12,651 --> 00:17:14,421 Code for Information Interchange. 
283 00:17:14,421 --> 00:17:16,760 And indeed, it began here\nin the US, and that's 284 00:17:16,760 --> 00:17:19,550 why it's actually a little\nbiased toward A's through Z's 285 00:17:19,550 --> 00:17:21,300 and a bit of punctuation as well. 286 00:17:21,300 --> 00:17:22,920 And that quickly became a problem. 287 00:17:22,921 --> 00:17:26,780 But if we start simply now,\nin English, the mapping 288 00:17:26,780 --> 00:17:28,860 itself is fairly straightforward. 289 00:17:28,861 --> 00:17:33,291 So if A is 65, B is 66,\nand dot dot dot, suppose 290 00:17:33,290 --> 00:17:36,020 that you received a text\nmessage, an email, from a friend 291 00:17:36,020 --> 00:17:39,080 and underneath the hood,\nso to speak, if you kind of 292 00:17:39,080 --> 00:17:42,740 looked inside the computer, what you\n 293 00:17:42,740 --> 00:17:48,080 or this email happened to\nbe the numbers 72, 73, 33 294 00:17:48,080 --> 00:17:50,810 or, really, the underlying\npattern of zeros and ones. 295 00:17:50,810 --> 00:17:56,376 What might your friend have sent you\n 296 00:18:02,280 --> 00:18:06,780 Well, apparently, according to this\n 297 00:18:06,780 --> 00:18:09,421 It's not obvious from\nthis chart what the 33 is 298 00:18:09,421 --> 00:18:11,520 but indeed, this\npattern represents "hi." 299 00:18:11,520 --> 00:18:13,773 And anyone want to guess,\nor if you know, what 33 is? 300 00:18:13,773 --> 00:18:14,940 AUDIENCE: Exclamation point. 301 00:18:14,941 --> 00:18:16,020 DAVID MALAN: Exclamation point. 302 00:18:16,020 --> 00:18:18,562 And this is, frankly, not the\nkind of thing most people know. 303 00:18:18,563 --> 00:18:22,241 But it's easily accessible by a\n 304 00:18:23,671 --> 00:18:26,431 When I said that we just need to\n 305 00:18:27,510 --> 00:18:29,302 They wrote it down in\na book or in a chart. 306 00:18:29,303 --> 00:18:33,961 And, for instance, here is our\n72 for H, here is our 73 for I 307 00:18:33,961 --> 00:18:37,861 and here is our 33\nfor exclamation point. 
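To see the mapping at work, the three numbers from the example can be decoded directly. This is a minimal Python sketch using the standard ASCII codec; the function name decode_ascii is an assumption made for illustration.

```python
def decode_ascii(codes):
    """Turn a list of ASCII code numbers into the text they represent."""
    # bytes() packs the numbers into raw bytes; decode() applies the ASCII chart
    return bytes(codes).decode("ascii")


print(decode_ascii([72, 73, 33]))
# ord() goes the other way, from a character to its number
print([ord(ch) for ch in "AB"])
```

Note that 72 and 73 are the uppercase letters H and I, so the decoded message is HI followed by an exclamation point.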
308 00:18:37,861 --> 00:18:41,191 And computers, Macs, PCs,\niPhones, Android devices 309 00:18:41,191 --> 00:18:43,721 just know this mapping\nby heart, if you will. 310 00:18:43,721 --> 00:18:46,451 They've been designed to\nunderstand those letters. 311 00:18:46,451 --> 00:18:48,121 So here, I might have received "hi. 312 00:18:48,121 --> 00:18:51,750 Technically, what I've received is\n 313 00:18:51,750 --> 00:18:54,960 But it's important to note that when\n 314 00:18:54,960 --> 00:18:58,020 in any format, be it\nemail or text or a file 315 00:18:58,020 --> 00:19:00,661 they do tend to come\nin standard lengths 316 00:19:00,661 --> 00:19:04,351 with a certain number of\nzeros and ones altogether. 317 00:19:04,351 --> 00:19:07,200 And this happens to be 8 plus 8, plus 8. 318 00:19:07,200 --> 00:19:10,561 So just to get the message\n"hi, exclamation point 319 00:19:10,560 --> 00:19:15,000 you would have received at least,\nit would seem, some 24 bits. 320 00:19:15,000 --> 00:19:18,330 But frankly, bits are so tiny,\nliterally and mathematically 321 00:19:18,330 --> 00:19:21,060 that we don't tend to think or\n 322 00:19:21,060 --> 00:19:23,370 You're probably more\nfamiliar with bytes. 323 00:19:23,371 --> 00:19:27,480 B-Y-T-E-S is a byte,\nis a byte, is a byte. 324 00:19:29,340 --> 00:19:32,373 And even those, frankly, aren't\n 325 00:19:32,374 --> 00:19:34,290 How high can you count\nif you have eight bits? 326 00:19:39,270 --> 00:19:42,570 Unless you want to go\nnegative, that's fine. 327 00:19:45,151 --> 00:19:48,631 Long story short, if we actually got\n 328 00:19:48,631 --> 00:19:54,061 and ones, and we figured out what\n 329 00:19:54,060 --> 00:19:57,570 to in decimal, it would\nindeed be 255, or less 330 00:19:57,570 --> 00:20:00,280 if you want to represent\nnegative numbers as well. 
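The arithmetic in this passage can be checked in a few lines. This is a minimal Python sketch that assumes one byte per character, which holds for ASCII text only; the names bits_needed and max_unsigned are made up for this example.

```python
BITS_PER_BYTE = 8


def bits_needed(text):
    """Number of bits used to store ASCII text at one byte per character."""
    return len(text.encode("ascii")) * BITS_PER_BYTE


def max_unsigned(bits):
    """Largest non-negative number that fits in the given number of bits."""
    # Every bit switched on gives 2**bits - 1
    return 2 ** bits - 1


print(bits_needed("HI!"))   # 8 plus 8 plus 8
print(max_unsigned(8))      # highest value one byte can hold
print(format(255, "08b"))   # that value written as eight bits
```

With all eight bits set to 1, a byte reaches 255, which matches the figure given above.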
331 00:20:00,280 --> 00:20:04,140 So this is useful because now we can\n 332 00:20:04,141 --> 00:20:06,811 but, if the files are bigger,\nkilobytes is thousands of bytes 333 00:20:06,810 --> 00:20:10,590 megabytes is millions of bytes,\ngigabytes is billions of bytes 334 00:20:10,590 --> 00:20:14,080 terabytes are trillions\nof bytes, and so forth. 335 00:20:14,080 --> 00:20:20,520 We have a vocabulary for these\n 336 00:20:20,520 --> 00:20:24,350 The problem is that, if you're using\n 337 00:20:24,351 --> 00:20:27,801 byte per character, and\noriginally, only seven, you 338 00:20:27,800 --> 00:20:30,390 can only represent 255 characters. 339 00:20:30,391 --> 00:20:33,980 And that's actually 256 total\ncharacters, including zero. 340 00:20:33,980 --> 00:20:37,790 And that's fine if you're using\nliterally English, in this case 341 00:20:39,141 --> 00:20:42,081 But there's many human\nlanguages in the world 342 00:20:42,080 --> 00:20:45,410 that need many more symbols\nand, therefore, many more bits. 343 00:20:45,411 --> 00:20:48,291 So, thankfully, the world\ndecided that we'll indeed 344 00:20:48,290 --> 00:20:51,441 support not just the US\nEnglish keyboard, but all 345 00:20:51,441 --> 00:20:54,500 of the accented characters that\n 346 00:20:54,500 --> 00:20:57,740 And heck, if we use enough\nbits, zeros and ones 347 00:20:57,740 --> 00:21:01,730 not only can we represent all\nhuman languages in written form 348 00:21:01,730 --> 00:21:03,651 as well as some emotions\nalong the way, we 349 00:21:03,651 --> 00:21:06,621 can capture the latter with\nthese things called emojis. 350 00:21:06,621 --> 00:21:09,230 And indeed, these are very\nmuch in vogue these days. 
351 00:21:09,230 --> 00:21:12,951 You probably send and/or receive\n 352 00:21:12,951 --> 00:21:16,520 These are just characters, like\nletters of an alphabet, patterns 353 00:21:16,520 --> 00:21:20,570 of zeros and ones that you're receiving,\n 354 00:21:20,570 --> 00:21:22,690 For instance, there\nare certain emojis that 355 00:21:22,691 --> 00:21:24,721 are represented with\ncertain patterns of bits. 356 00:21:24,721 --> 00:21:28,070 And when you receive them, your\n 357 00:21:30,000 --> 00:21:32,780 And this newer standard\nis called Unicode. 358 00:21:32,780 --> 00:21:35,270 So it's a superset of\nwhat we called ASCII. 359 00:21:35,270 --> 00:21:39,951 And Unicode is just a mapping of many\n 360 00:21:39,951 --> 00:21:42,320 or characters, more\ngenerally, that might 361 00:21:42,320 --> 00:21:45,140 use eight bits for\nbackwards compatibility 362 00:21:45,141 --> 00:21:49,821 with the old way of doing things with\n 363 00:21:49,820 --> 00:21:51,680 And if you have 16\nbits, you can actually 364 00:21:51,681 --> 00:21:54,713 represent more than\n65,000 possible letters. 365 00:21:54,713 --> 00:21:55,880 And that's getting up there. 366 00:21:55,881 --> 00:22:01,341 And heck, Unicode might even use 32\n 367 00:22:01,340 --> 00:22:03,230 and punctuation symbols and emojis. 368 00:22:03,230 --> 00:22:06,411 And that would give you up\nto 4 billion possibilities. 369 00:22:06,411 --> 00:22:09,980 And, I daresay, one of the reasons we\n 370 00:22:11,060 --> 00:22:14,060 I mean, we've got room for\nbillions more, literally. 371 00:22:14,060 --> 00:22:16,280 So, in fact, just as a\nlittle bit of trivia 372 00:22:16,280 --> 00:22:21,500 has anyone ever received this decimal\n 373 00:22:21,500 --> 00:22:25,971 has anyone ever received this pattern\n 374 00:22:25,971 --> 00:22:29,070 in a text or an email,\nperhaps this past year? 
375 00:22:29,070 --> 00:22:33,951 Well, if you actually look this up,\n 376 00:22:33,951 --> 00:22:37,131 happens to represent\nface with medical mask. 377 00:22:37,131 --> 00:22:40,941 And notice that if you've got\nan iPhone or an Android device 378 00:22:40,941 --> 00:22:43,290 you might be seeing different things. 379 00:22:43,290 --> 00:22:46,580 In fact, this is the Android\nversion of this, most recently. 380 00:22:46,580 --> 00:22:49,221 This is the iOS version\nof it, most recently. 381 00:22:49,221 --> 00:22:51,891 And there's bunches of other\ninterpretations by other companies 382 00:22:53,040 --> 00:22:55,730 So Unicode, as a\nconsortium, if you will 383 00:22:55,730 --> 00:22:58,861 has standardized the descriptions\nof what these things are. 384 00:22:58,861 --> 00:23:02,391 But the companies themselves,\nmanufacturers out there 385 00:23:02,391 --> 00:23:05,000 have generally interpreted\nit as they see fit. 386 00:23:05,000 --> 00:23:08,330 And this can lead to some\nhuman miscommunications. 387 00:23:08,330 --> 00:23:11,870 In fact, for like, literally,\n 388 00:23:11,871 --> 00:23:14,781 I started being in the habit of\n 389 00:23:14,780 --> 00:23:17,820 like this because I thought it was\n 390 00:23:17,820 --> 00:23:19,760 I didn't realize this\nis the emoji for hug 391 00:23:19,760 --> 00:23:24,350 because whatever device I was using\n 392 00:23:24,351 --> 00:23:27,171 And that's because of their\ninterpretation of the data. 393 00:23:27,171 --> 00:23:31,551 This has happened too when\nwhat was a gun became a water 394 00:23:31,550 --> 00:23:33,590 pistol in some manufacturers' eyes. 395 00:23:33,590 --> 00:23:37,971 And so it's an interesting dichotomy\n 396 00:23:37,971 --> 00:23:42,500 want to represent and how we\n 397 00:23:42,500 --> 00:23:45,891 Questions, then, on these\nrepresentations of formats 398 00:23:45,891 --> 00:23:49,021 be it numbers or letters, or soon more. 
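For readers who want to look inside an emoji, here is a minimal Python sketch. It assumes the character under discussion is U+1F637, whose official Unicode name is FACE WITH MEDICAL MASK, and it assumes UTF-8 as the encoding; the transcript names neither, so treat both as illustrative choices rather than what appeared on the slide.

```python
import unicodedata

mask = "\U0001F637"  # assumed code point for the emoji being discussed

print(unicodedata.name(mask))  # the description standardized by Unicode
print(ord(mask))               # the code point as a decimal number

encoded = mask.encode("utf-8")  # UTF-8 is an assumption, one of several encodings
print(len(encoded) * 8)         # bits used for this one character
print(" ".join(format(byte, "08b") for byte in encoded))

# A plain letter still fits in a single byte, which keeps ASCII text compatible
print(len("A".encode("utf-8")))

# How many distinct patterns 16 and 32 bits allow
print(2 ** 16, 2 ** 32)
```

The name printed on the first line is the part Unicode standardizes. The artwork shown for that pattern of bits is left to each manufacturer, which is why the same character can look different across devices.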
399 00:23:49,520 --> 00:23:52,140 AUDIENCE: Why is decimal\npopular for a computer 400 00:23:52,141 --> 00:23:54,739 if binary is the basis for everything? 401 00:23:54,739 --> 00:23:56,530 DAVID MALAN: Sorry,\nwhy is what so popular? 402 00:23:56,530 --> 00:23:59,310 AUDIENCE: Why is the decimal popular\n 403 00:23:59,310 --> 00:24:01,900 DAVID MALAN: Yeah, so we'll come\n 404 00:24:01,901 --> 00:24:03,811 There are other ways\nto represent numbers. 405 00:24:07,171 --> 00:24:12,480 And hexadecimal is yet a fourth that\n 406 00:24:12,480 --> 00:24:15,540 through 9 plus A, B, C,\nD, E, F. And somehow 407 00:24:15,540 --> 00:24:18,600 you can similarly count\neven higher with those. 408 00:24:18,601 --> 00:24:21,121 We'll see in a few weeks\nwhy this is compelling. 409 00:24:21,121 --> 00:24:24,781 But hexadecimal, long story\nshort, uses four bits per digit. 410 00:24:24,780 --> 00:24:28,290 And so, four bits, if you have two\n 411 00:24:28,290 --> 00:24:30,810 And it's just a very\nconvenient unit of measure. 412 00:24:30,810 --> 00:24:34,140 And it's also human convention in\n 413 00:24:34,141 --> 00:24:35,874 But we'll come back to that soon. 414 00:24:36,540 --> 00:24:39,923 AUDIENCE: Do the lights on the\nstage supposedly say that-- 415 00:24:39,923 --> 00:24:42,590 DAVID MALAN: Do the lights on the\nstage supposedly say anything? 416 00:24:42,590 --> 00:24:46,310 Well, if we had thought in advance\nto use maybe 64 light bulbs 417 00:24:46,310 --> 00:24:51,650 that would seem to give us 8\ntotal bytes on stage, 8 times 8 418 00:24:55,171 --> 00:24:58,447 Other questions on 0's and 1's? 419 00:24:58,446 --> 00:25:01,130 It's a little bright in here. 420 00:25:04,911 --> 00:25:08,931 Where everyone's pointing\nsomewhere specific. 421 00:25:11,391 --> 00:25:14,863 AUDIENCE: I was just going\nto ask about the 255 bits 422 00:25:14,863 --> 00:25:16,346 like with the maximum characters. 
423 00:25:16,846 --> 00:25:19,555 DAVID MALAN: Ah, sure, and we'll\n 424 00:25:19,555 --> 00:25:22,131 in the coming days too,\nat a slower pace too 425 00:25:22,131 --> 00:25:26,074 we have, with eight bits, two\npossible values for the first 426 00:25:26,074 --> 00:25:28,490 and then two for the next, two\nfor the next, and so forth. 427 00:25:30,131 --> 00:25:32,230 That's 2 to the eighth\npower total, which 428 00:25:32,230 --> 00:25:36,730 means you can have 256 total\n 429 00:25:36,730 --> 00:25:40,580 But as we'll see soon computer\nscientists, programmers 430 00:25:40,580 --> 00:25:45,250 software often starts counting at 0 by\n 431 00:25:45,250 --> 00:25:50,980 patterns, 00000000 to represent\n 432 00:25:50,980 --> 00:25:56,830 you only have 255 other patterns left\n 433 00:25:59,421 --> 00:26:04,421 All right, so what then might we\n 434 00:26:04,990 --> 00:26:07,451 Well, we of course have things\nlike colors and programs 435 00:26:07,451 --> 00:26:09,682 like Photoshop and pictures and photos. 436 00:26:09,682 --> 00:26:11,141 Well let me ask the question again. 437 00:26:11,141 --> 00:26:14,351 How might a computer, do you think,\n 438 00:26:16,000 --> 00:26:19,510 Like what are our options if all we've\n 439 00:26:23,471 --> 00:26:27,070 RGB indeed is this acronym that\nrepresents some amount of red 440 00:26:27,070 --> 00:26:29,590 and some amount of green and\nblue and indeed computers 441 00:26:29,590 --> 00:26:32,380 can represent colors by just doing that. 442 00:26:32,381 --> 00:26:33,893 Remembering, for instance, this dot. 443 00:26:33,893 --> 00:26:36,851 This yellow dot on the screen that\n 444 00:26:36,851 --> 00:26:39,611 these days, well that's some amount\n 445 00:26:40,443 --> 00:26:42,340 And if you sort of mix\nthose colors together 446 00:26:42,340 --> 00:26:44,080 you can indeed get a very specific one. 447 00:26:44,080 --> 00:26:46,660 And we'll see you in\njust a moment just that. 
448 00:26:46,661 --> 00:26:51,408 So indeed earlier on, humans\nonly used seven bits total. 449 00:26:51,407 --> 00:26:54,490 And it was only once they decided,\n 450 00:26:54,490 --> 00:26:57,100 got extended ASCII and\nthat was initially in part 451 00:26:57,101 --> 00:27:00,941 a solution to the same problem of\n 452 00:27:00,941 --> 00:27:05,248 in those patterns of zeros and ones\n 453 00:27:06,080 --> 00:27:10,330 But even that wasn't enough and that's\n 454 00:27:11,891 --> 00:27:16,961 So if we come back now to\nthis one particular color. 455 00:27:16,961 --> 00:27:19,330 RGB was proposed as a scheme,\nbut how might this work? 456 00:27:19,330 --> 00:27:21,280 Well, consider for instance this. 457 00:27:21,280 --> 00:27:25,661 If we do indeed decide as a group to\n 458 00:27:25,661 --> 00:27:28,421 with some mixture of some red,\nsome green, and some blue 459 00:27:28,421 --> 00:27:33,311 we have to decide how to represent\n 460 00:27:33,310 --> 00:27:37,030 Well, it turns out if all we have\n 461 00:27:38,351 --> 00:27:44,501 For instance, suppose a computer we're\n 462 00:27:44,500 --> 00:27:47,471 no longer in the context of\nan email or a text message 463 00:27:47,471 --> 00:27:51,701 but now in the context of something\n 464 00:27:51,701 --> 00:27:55,331 and creating graphical files,\nmaybe this first number 465 00:27:55,330 --> 00:28:00,140 could be interpreted as representing\n 466 00:28:00,688 --> 00:28:02,021 And that's exactly what happens. 467 00:28:02,020 --> 00:28:05,740 You can think of the first digit as\n 468 00:28:05,740 --> 00:28:10,030 And so ultimately when you combine that\n 469 00:28:10,030 --> 00:28:14,230 that amount of blue, it turns out it's\n 470 00:28:14,230 --> 00:28:18,040 And indeed, you can come up\nwith a numbers between 0 and 255 471 00:28:18,040 --> 00:28:21,461 for each of those colors to mix any\n 472 00:28:21,461 --> 00:28:23,330 And you can actually\nsee this in practice. 
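The idea of three numbers between 0 and 255 can be sketched in Python as follows. The function names `pack_rgb` and `unpack_rgb` and the sample color are illustrative assumptions, not the values from the slide; the sketch also assumes one byte per color, stored red first.

```python
def pack_rgb(red: int, green: int, blue: int) -> int:
    """Combine three 8-bit amounts into one 24-bit number (red, then green, then blue)."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each amount must fit in 8 bits (0 to 255)")
    return (red << 16) | (green << 8) | blue


def unpack_rgb(pixel: int) -> tuple[int, int, int]:
    """Split a 24-bit number back into its red, green, and blue amounts."""
    return (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF


pixel = pack_rgb(255, 255, 0)  # full red plus full green, no blue: a yellow
print(f"{pixel:024b}")         # 111111111111111100000000
print(f"#{pixel:06X}")         # #FFFF00
print(unpack_rgb(pixel))       # (255, 255, 0)
```

The same bits could just as well be read as text or as one large number. It is the program opening the file that decides to treat them as a color.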
473 00:28:23,330 --> 00:28:26,530 Even though our screens,\nadmittedly, are getting really good 474 00:28:26,530 --> 00:28:30,739 on our phones and laptops such that you\n 475 00:28:30,739 --> 00:28:32,530 You might have heard\nthe term pixel before. 476 00:28:32,530 --> 00:28:34,661 Pixel's just a dot on\nthe screen and you've 477 00:28:34,661 --> 00:28:38,260 got thousands, millions of them these\n 478 00:28:38,260 --> 00:28:41,201 If I take even this\nemoji, which again happens 479 00:28:41,201 --> 00:28:46,211 to be one company's interpretation\nof a face with medical mask 480 00:28:46,211 --> 00:28:48,550 and zoom in a bit, maybe\nzoom in a bit more 481 00:28:48,550 --> 00:28:50,890 you can actually start\nto see these pixels. 482 00:28:50,891 --> 00:28:53,561 Things get pixelated\nbecause what you're seeing 483 00:28:53,560 --> 00:28:57,280 is each of the individual dots\n 484 00:28:57,280 --> 00:28:59,681 And apparently each of\nthese individual dots 485 00:28:59,681 --> 00:29:04,961 are probably using 24 bits, eight bits\n 486 00:29:04,961 --> 00:29:07,300 bits for blue, in some pattern. 487 00:29:07,300 --> 00:29:11,290 This program or some other like\n 488 00:29:11,290 --> 00:29:16,270 and it's white or yellow or\nblack or some brown in between. 489 00:29:16,270 --> 00:29:19,870 So if you look sort of awkwardly, but\n 490 00:29:19,871 --> 00:29:23,673 or maybe your TV, you can\nsee exactly this, too. 491 00:29:23,673 --> 00:29:25,631 All right, well, what\nabout things that we also 492 00:29:25,631 --> 00:29:27,256 watch every day on YouTube or the like? 493 00:29:28,540 --> 00:29:30,550 How would a computer,\nknowing what we know now 494 00:29:30,550 --> 00:29:32,545 represent something like a video? 495 00:29:35,080 --> 00:29:37,930 How might you represent a video\nusing only zeros and ones? 496 00:29:38,760 --> 00:29:43,260 AUDIENCE: As we can see here,\nthey represent images, right? 
497 00:29:43,260 --> 00:29:47,760 [INAUDIBLE] sounds of\nthe 0 and 1s as well. 498 00:29:52,040 --> 00:29:55,410 To summarize, what video really\n 499 00:29:55,411 --> 00:29:58,131 It's not just one image, it's\nnot just one letter or a number 500 00:29:58,131 --> 00:30:01,911 it's presumably some kind of\nsequence because time is passing. 501 00:30:01,911 --> 00:30:05,780 So with a whole bunch of images,\nmaybe 24 maybe 30 per second 502 00:30:05,780 --> 00:30:08,480 if you fly them by the\nhuman's eyes, we can 503 00:30:08,480 --> 00:30:11,240 interpret them using our eyes\nand brain that there is now 504 00:30:11,240 --> 00:30:13,431 movement and therefore video. 505 00:30:13,431 --> 00:30:16,040 Similarly with audio or music. 506 00:30:16,040 --> 00:30:20,211 If we just came up with some convention\n 507 00:30:20,211 --> 00:30:23,361 on a musical instrument, could we have\n 508 00:30:23,361 --> 00:30:25,153 And this might be\nactually pretty familiar. 509 00:30:25,153 --> 00:30:29,901 Let me pull up a quick video here,\n 510 00:30:31,310 --> 00:30:32,990 You might remember from childhood. 511 00:30:54,961 --> 00:30:58,861 So granted that particular\nvideo is an actual video 512 00:30:58,861 --> 00:31:02,311 of a paper-based animation, but\n 513 00:31:02,310 --> 00:31:06,270 is some sequence of these images,\nwhich themselves of course 514 00:31:06,270 --> 00:31:10,320 are just zeros and ones because they're\n 515 00:31:10,320 --> 00:31:14,003 Now something like musical notes like\n 516 00:31:14,003 --> 00:31:16,170 might just naturally play\nthese on physical devices 517 00:31:16,171 --> 00:31:19,201 but computers can certainly\nrepresent those sounds, too. 
518 00:31:19,201 --> 00:31:22,050 For instance, a popular\nformat for audio is 519 00:31:22,050 --> 00:31:24,780 called MIDI and MIDI\nmight just represent 520 00:31:24,780 --> 00:31:29,100 each note that you saw a moment ago\n 521 00:31:29,101 --> 00:31:32,491 But more generally, you might\nthink about music as having notes 522 00:31:32,490 --> 00:31:35,401 for instance, A through G, maybe\nsome flats and some sharps 523 00:31:35,401 --> 00:31:39,000 you might have the duration like how\n 524 00:31:39,000 --> 00:31:41,040 on a piano or some\nother device, and then 525 00:31:41,040 --> 00:31:43,770 just the volume like how hard\ndoes a human in the real world 526 00:31:43,770 --> 00:31:46,681 press down on that key and\ntherefore how loud is that sound? 527 00:31:46,681 --> 00:31:51,571 It would seem that just remembering\n 528 00:31:51,570 --> 00:31:57,090 we can then represent really all of\n 529 00:31:57,090 --> 00:32:00,420 So that then is really\na laundry list of ways 530 00:32:00,421 --> 00:32:02,551 that we can just represent information. 531 00:32:02,550 --> 00:32:05,313 Again, computers or digital have\nall of these different formats 532 00:32:05,314 --> 00:32:07,980 but at the end of the day and as\nfancy as those devices in years 533 00:32:07,980 --> 00:32:11,760 are, it's just zeros and ones, tiny\n 534 00:32:11,760 --> 00:32:14,941 if you will, represented in some\nway and it's up to the software 535 00:32:14,941 --> 00:32:17,941 that you and I and others\nwrite to use those zeros 536 00:32:17,941 --> 00:32:21,421 and ones in ways we want to get\nthe computers to do something 537 00:32:22,861 --> 00:32:27,181 Questions, then, on this representation\n 538 00:32:27,181 --> 00:32:30,811 is ultimately what problem solving\n 539 00:32:30,810 --> 00:32:35,590 and producing new via\nsome process in between. 540 00:32:40,070 --> 00:32:43,999 AUDIENCE: Yeah, so we talked about how\n 541 00:32:43,999 --> 00:32:45,962 you to interpret information. 
542 00:32:45,962 --> 00:32:50,873 How does a file format like .mp4\n 543 00:32:52,346 --> 00:32:53,971 DAVID MALAN: So a really good question. 544 00:32:53,971 --> 00:32:55,921 There are many other\nfile formats out there. 545 00:32:55,921 --> 00:32:58,730 You allude to MP4 for video\nand more generally in use 546 00:32:58,730 --> 00:33:01,340 are these things called\ncodecs and containers. 547 00:33:01,340 --> 00:33:04,910 It's not quite as simple when\nusing larger files, for instance 548 00:33:04,911 --> 00:33:08,449 in more modern formats that a\n 549 00:33:09,560 --> 00:33:13,160 If you stored that many images\nfor like a Hollywood movie 550 00:33:13,161 --> 00:33:17,431 like 24 or 30 of them per second,\n 551 00:33:17,431 --> 00:33:19,760 And if you've ever taken\nphotos on your phone 552 00:33:19,760 --> 00:33:23,780 you might know how many megabytes or\n 553 00:33:24,381 --> 00:33:27,771 So humans have developed over\nthe years fancier software 554 00:33:27,770 --> 00:33:32,480 that uses much more math to represent\n 555 00:33:32,480 --> 00:33:35,240 just using somehow shorter\npatterns of zeros and ones 556 00:33:35,240 --> 00:33:37,830 than our most simplistic\nrepresentation here. 557 00:33:37,830 --> 00:33:40,160 And they use what might\nbe called compression. 558 00:33:40,161 --> 00:33:42,561 If you've ever used a zip\nfile or something else 559 00:33:42,560 --> 00:33:45,170 somehow your computer is\nusing fewer zeros and ones 560 00:33:45,171 --> 00:33:47,061 to represent the same\namount of information 561 00:33:47,060 --> 00:33:49,522 ideally without losing any information. 562 00:33:49,522 --> 00:33:52,730 In the world of multimedia, which we'll\n 563 00:33:52,730 --> 00:33:56,330 there are both lossy and\nlossless formats out there. 564 00:33:56,330 --> 00:33:59,961 Lossless means you lose\nno information whatsoever.
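Lossless compression can be tried out with Python's standard `zlib` module, which implements the DEFLATE method commonly used in zip files. This is only a sketch of the idea and not how video codecs work; the repetitive stand-in "image" is an assumption chosen because it compresses well.

```python
import zlib

# Stand-in data: 10,000 identical 3-byte "pixels" (highly repetitive on purpose).
original = bytes([255, 255, 0]) * 10_000

compressed = zlib.compress(original)    # fewer zeros and ones
restored = zlib.decompress(compressed)  # back to the exact original bytes

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print("identical after round trip:", restored == original)
```

Because every original byte comes back, nothing is lost. A lossy format would instead accept small differences after the round trip in exchange for an even smaller file.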
565 00:33:59,961 --> 00:34:05,361 But more commonly as you're alluding\n 566 00:34:05,361 --> 00:34:08,151 where you're actually throwing\naway some amount of quality. 567 00:34:08,150 --> 00:34:10,592 You're getting some amount\nof pixelation that might not 568 00:34:10,592 --> 00:34:13,550 look perfect to the human, but heck\n 569 00:34:14,541 --> 00:34:17,691 And in the world of multimedia,\n 570 00:34:17,690 --> 00:34:20,480 and other MPEG containers\nthat can combine 571 00:34:20,481 --> 00:34:24,561 different formats of video, different\n 572 00:34:24,561 --> 00:34:26,841 but there, too, do\ndesigners have discretion. 573 00:34:28,911 --> 00:34:32,311 Other questions, then, on\ninformation here as well? 574 00:34:32,811 --> 00:34:35,523 AUDIENCE: So I know\ncomputers used to be very big 575 00:34:35,523 --> 00:34:37,510 and taking up like a\nwhole room and stuff. 576 00:34:37,510 --> 00:34:41,545 Is the reason they've gotten\nsmaller because we can store 577 00:34:41,545 --> 00:34:43,826 this information piecemeal or what? 578 00:34:44,701 --> 00:34:47,659 I mean, back in the day you might\n 579 00:34:47,659 --> 00:34:50,340 tube, which is like some\nphysically large device that 580 00:34:50,340 --> 00:34:53,280 might have only stored some 0 or 1. 581 00:34:53,280 --> 00:34:55,920 Yes, it is the miniaturization\nof hardware these days 582 00:34:55,920 --> 00:35:00,721 that has allowed us to store as many\n 583 00:35:01,661 --> 00:35:03,583 And as we've built more\nfancy machines that 584 00:35:03,583 --> 00:35:06,000 can sort of design this hardware\nat an even smaller scale 585 00:35:06,001 --> 00:35:08,531 we're just packing more and\nmore into these devices. 586 00:35:08,530 --> 00:35:09,840 But there, too, is a trade off. 587 00:35:09,840 --> 00:35:13,110 For instance, you might know by\nusing your phone or your laptop 588 00:35:13,110 --> 00:35:15,810 for quite a while, maybe on\nyour lap, starts to get warm. 
589 00:35:15,811 --> 00:35:17,941 So there are these literal\nphysical side effects 590 00:35:17,940 --> 00:35:20,370 of this where now some\nof our devices run hot. 591 00:35:20,371 --> 00:35:23,251 This is why like a data\ncenter in the real world 592 00:35:23,251 --> 00:35:25,591 might need more air conditioning\nthan a typical place 593 00:35:25,590 --> 00:35:28,712 because there are these\nphysical artifacts as well. 594 00:35:28,713 --> 00:35:31,921 In fact, if you'd like to see one of\n 595 00:35:31,920 --> 00:35:35,310 across the river here in now Allston\n 596 00:35:35,311 --> 00:35:40,621 is the Harvard Mark 1 computer that\n 597 00:35:41,981 --> 00:35:44,491 Well if we come back now\nto this first picture 598 00:35:44,490 --> 00:35:47,161 being computer science or\nreally problem solving 599 00:35:47,161 --> 00:35:49,110 I daresay we have more\nthan enough ways now 600 00:35:49,110 --> 00:35:53,190 to represent information, input and\n 601 00:35:53,190 --> 00:35:56,250 on something and thankfully all\nof those before us have given us 602 00:35:56,251 --> 00:35:57,961 things like ASCII and Unicode. 603 00:35:57,960 --> 00:36:01,150 Not to mention MP4s, word\ndocuments, and the like. 604 00:36:01,150 --> 00:36:05,250 But what's inside of this proverbial\n 605 00:36:05,251 --> 00:36:06,998 going in the outputs are coming? 606 00:36:06,998 --> 00:36:09,539 Well that's where we get this\nterm you might have heard, too. 607 00:36:09,539 --> 00:36:14,670 An algorithm, which is just step-by-step\n 608 00:36:14,670 --> 00:36:17,940 incarnated in the world\nof computers by software. 
609 00:36:17,940 --> 00:36:20,910 When you write software\naka programs, you 610 00:36:20,911 --> 00:36:26,099 are implementing one or more algorithms,\n 611 00:36:26,099 --> 00:36:29,380 for solving some problem, and maybe\n 612 00:36:29,380 --> 00:36:31,422 but at the end of the day,\nno matter the language 613 00:36:31,422 --> 00:36:34,410 you use the computer is going\nto represent what you type 614 00:36:37,769 --> 00:36:40,260 So what might be a\nrepresentative algorithm? 615 00:36:40,260 --> 00:36:42,630 Nowadays you might use\nyour phone quite a bit 616 00:36:42,630 --> 00:36:45,480 to make calls or send texts\nor emails and therefore you 617 00:36:45,481 --> 00:36:48,001 have a whole bunch of\ncontacts in your address book. 618 00:36:48,001 --> 00:36:50,251 Nowadays, of course,\nthis is very digital 619 00:36:50,251 --> 00:36:53,341 but whether on iOS or\nAndroid or the like 620 00:36:53,340 --> 00:36:56,010 you might have a whole\nbunch of names, first name 621 00:36:56,010 --> 00:36:58,942 and/or last, as well as numbers\nand emails and the like. 622 00:36:58,943 --> 00:37:01,651 You might be in the habit of like\n 623 00:37:01,650 --> 00:37:05,190 all of those names to find\nthe person you want to call. 624 00:37:05,190 --> 00:37:09,000 It's probably sorted alphabetically by\n 625 00:37:11,431 --> 00:37:16,764 This is frankly quite the same as\n 626 00:37:16,764 --> 00:37:18,181 when we just used a physical book. 627 00:37:18,181 --> 00:37:20,014 In this physical book\nmight be a whole bunch 628 00:37:20,014 --> 00:37:22,171 of names alphabetically\nsorted from left to right 629 00:37:22,170 --> 00:37:24,340 corresponding to a\nwhole bunch of numbers. 630 00:37:24,340 --> 00:37:27,000 So suppose that in this\nold Harvard phone book 631 00:37:27,001 --> 00:37:29,161 we want to search for John Harvard. 
632 00:37:29,161 --> 00:37:31,441 We might of course start\nquite simply at the beginning 633 00:37:31,440 --> 00:37:36,070 here, looking at one page at a\ntime, and this is an algorithm. 634 00:37:36,070 --> 00:37:40,320 This is like literally step-by-step\nlooking for the solution 635 00:37:41,490 --> 00:37:43,710 In that sense, if John\nHarvard's in the phone book 636 00:37:43,710 --> 00:37:47,663 is this algorithm page-by-page\ncorrect, would you say? 637 00:37:49,121 --> 00:37:51,041 Like if John Harvard's\nin the phone book 638 00:37:51,041 --> 00:37:54,641 obviously I'm eventually going to get to\n 639 00:37:56,141 --> 00:37:58,340 Is it well designed, would you say? 640 00:37:58,840 --> 00:38:01,990 I mean this is going to take forever\n 641 00:38:01,990 --> 00:38:03,190 depending how this thing's sorted. 642 00:38:03,190 --> 00:38:04,930 All right, well let\nme go a little faster. 643 00:38:04,931 --> 00:38:06,431 I'll start like two pages at a time. 644 00:38:06,431 --> 00:38:11,050 2, 4, 6, 8, 10, 12, and so forth. 645 00:38:11,050 --> 00:38:13,721 Sounds faster, is faster, is it correct? 646 00:38:14,411 --> 00:38:16,421 DAVID MALAN: OK, why is it not correct? 647 00:38:17,235 --> 00:38:19,043 AUDIENCE: So if you're\nstarting on page 1 648 00:38:19,043 --> 00:38:22,085 you're only going odd number of pages,\n 649 00:38:23,630 --> 00:38:26,297 If I start on an odd number of\npages and I'm going two at a time 650 00:38:26,297 --> 00:38:28,230 I might miss pages in between. 651 00:38:28,231 --> 00:38:30,493 And if I therefore conclude\nwhen I get to the back 652 00:38:30,492 --> 00:38:33,201 of the book there was no John\n 653 00:38:33,201 --> 00:38:35,451 This would be again one of these bugs. 654 00:38:35,451 --> 00:38:39,260 But if I try a little harder,\nI feel like there's a solution. 655 00:38:39,260 --> 00:38:41,550 We don't have to completely\nthrow out this algorithm. 
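The first two approaches can be sketched in Python to show why going two pages at a time is faster but not correct as stated. This sketch simplifies each page to a single name; the function names and the sample names other than Harvard are made up for illustration.

```python
def search_one_at_a_time(pages, name):
    """Look at every page in order; return its index, or None if the name is absent."""
    for index, page in enumerate(pages):
        if page == name:
            return index
    return None


def search_two_at_a_time(pages, name):
    """Deliberately buggy: looks only at every second page, so it can miss a name."""
    for index in range(0, len(pages), 2):
        if pages[index] == name:
            return index
    return None


pages = ["Adams", "Baker", "Chen", "Harvard", "Malan", "Smith"]
print(search_one_at_a_time(pages, "Harvard"))  # 3
print(search_two_at_a_time(pages, "Harvard"))  # None, although the name is there
```

The second function wrongly reports that the name is missing whenever it sits on a page that was skipped, which is exactly the bug the audience points out.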
656 00:38:41,550 --> 00:38:44,240 I think we can probably go\nroughly twice as fast still. 657 00:38:44,240 --> 00:38:47,001 But what should we do\ninstead to fix this? 658 00:38:59,610 --> 00:39:02,985 So I think what many of us, most of us,\n 659 00:39:02,985 --> 00:39:05,610 these days, we might go roughly\nto the middle of the phone book 660 00:39:05,610 --> 00:39:06,902 just to kind of get us started. 661 00:39:06,902 --> 00:39:10,350 And now I'm looking down, I'm looking\n 662 00:39:10,351 --> 00:39:13,061 and it looks like I'm in the M section. 663 00:39:13,061 --> 00:39:15,759 So just to be clear,\nwhat should I do next? 664 00:39:21,657 --> 00:39:23,740 DAVID MALAN: OK, and\npresumably it is John Harvard 665 00:39:23,740 --> 00:39:24,949 would be to the left of this. 666 00:39:24,949 --> 00:39:27,931 So here's an opportunity to\nfiguratively and literally tear 667 00:39:27,931 --> 00:39:32,461 this particular problem in half,\nthrow half of the problem away. 668 00:39:32,460 --> 00:39:34,980 It's actually pretty easy\nif you just do it that way. 669 00:39:36,791 --> 00:39:41,141 But I've now just decreased the\n 670 00:39:41,141 --> 00:39:45,240 So if I started with 1,000 pages\nof phone numbers and names, now 671 00:39:46,590 --> 00:39:48,601 And already we haven't\nfound John Harvard 672 00:39:48,601 --> 00:39:50,751 but that's a big bite\nout of this problem. 673 00:39:50,751 --> 00:39:54,339 I do think it's correct because if\n 674 00:39:54,338 --> 00:39:56,130 he's definitely not\ngoing to be over there. 675 00:39:56,130 --> 00:40:00,150 I think if I repeat this again\n 676 00:40:00,150 --> 00:40:02,070 here I might have gone a little too far. 677 00:40:02,070 --> 00:40:03,581 Now I'm in like the E section. 
678 00:40:03,581 --> 00:40:08,131 So let me tear the problem in half\n 679 00:40:08,130 --> 00:40:11,100 and again repeat, dividing\nand dividing and conquering 680 00:40:11,101 --> 00:40:14,550 until finally, presumably, I end\n 681 00:40:14,550 --> 00:40:18,090 book on which John Harvard's\nname either is or is not 682 00:40:18,090 --> 00:40:21,400 but because of the algorithm\nyou proposed, step by step 683 00:40:21,400 --> 00:40:24,550 I know that he's not in\nanything I discarded. 684 00:40:24,550 --> 00:40:28,201 So, traumatic as that\nmight have been made out 685 00:40:28,201 --> 00:40:31,715 to be, it's actually just harnessing\n 686 00:40:31,715 --> 00:40:33,840 Indeed, this is what\nprogramming is all about, too. 687 00:40:33,840 --> 00:40:36,690 It's not about learning\na completely new world 688 00:40:36,690 --> 00:40:40,350 but really just how to harness intuition\n 689 00:40:40,351 --> 00:40:43,381 have and take naturally\nbut learning how to express 690 00:40:43,380 --> 00:40:45,900 them now more succinctly,\nmore precisely 691 00:40:45,900 --> 00:40:49,150 using things called\nprogramming languages. 692 00:40:49,150 --> 00:40:54,330 Why is an algorithm like that if I found\n 693 00:40:54,331 --> 00:40:56,820 just doing the first\none or even the second 694 00:40:56,820 --> 00:40:59,940 and maybe doubling back\nto check those even pages? 695 00:40:59,940 --> 00:41:01,830 Well, let's just look\nat a little chart here. 696 00:41:01,831 --> 00:41:04,291 Again, we don't have to get\ninto the nuances of numbers 697 00:41:04,291 --> 00:41:07,501 but if we've got like a chart\nhere, xy plot, on the x-axis 698 00:41:07,501 --> 00:41:09,431 here I claim is the size of the problem. 699 00:41:09,431 --> 00:41:12,441 So measured in the number\nof pages in the phone book. 700 00:41:12,440 --> 00:41:16,050 So the farther you go out here, the\n 701 00:41:16,050 --> 00:41:18,641 And here we have time\nto solve on the y-axis.
702 00:41:18,641 --> 00:41:21,331 So the higher you go\nup, the more time it's 703 00:41:21,331 --> 00:41:24,101 going to be taking to solve\nthat particular problem. 704 00:41:24,101 --> 00:41:27,751 So let's just arbitrarily say that\n 705 00:41:27,751 --> 00:41:31,511 like n pages, might be\nrepresented graphically like this. 706 00:41:31,510 --> 00:41:33,931 No matter the slope,\nit's a straight line 707 00:41:33,931 --> 00:41:36,360 because there's presumably\na one-to-one relationship 708 00:41:36,360 --> 00:41:40,440 between numbers of pages and number\n 709 00:41:40,951 --> 00:41:43,291 If the phone company adds\nanother page next year 710 00:41:43,291 --> 00:41:45,301 because some new people\nmove to town, that's 711 00:41:45,300 --> 00:41:47,161 going to require one\nadditional page for me. 712 00:41:49,110 --> 00:41:52,800 If, though, we use the second\nalgorithm, flawed though it was 713 00:41:52,800 --> 00:41:56,431 unless we double back a little bit\n 714 00:41:56,431 --> 00:41:59,280 that, too, is going to be a\nstraight line, but it's 715 00:41:59,280 --> 00:42:02,820 going to be a different slope because\n 716 00:42:02,820 --> 00:42:06,250 relationship because I'm\ngoing two pages at a time. 717 00:42:06,251 --> 00:42:10,981 So if the phone company adds\nanother page or another two pages 718 00:42:10,981 --> 00:42:13,008 that's still only just one more step. 719 00:42:13,007 --> 00:42:15,090 You can see the difference\nif I kind of draw this. 720 00:42:15,090 --> 00:42:18,210 If this is the phone book in\nquestion, this number of pages 721 00:42:18,210 --> 00:42:20,910 it might take this many\nseconds on the yellow line 722 00:42:20,911 --> 00:42:24,661 to represent or to find\nsomeone like John Harvard. 723 00:42:24,661 --> 00:42:27,421 But of course on the first\nalgorithm, the red line 724 00:42:27,420 --> 00:42:29,760 it's literally going to\ntake twice as many steps.
725 00:42:29,760 --> 00:42:32,311 And what does the n here mean?\nn is the go-to variable 726 00:42:32,311 --> 00:42:36,251 for a computer scientist or programmer\n 727 00:42:36,251 --> 00:42:38,671 So if the number of pages\nin the phone book is n 728 00:42:38,670 --> 00:42:41,580 the number of steps the second\nalgorithm would have taken 729 00:42:41,581 --> 00:42:44,266 would be in the worst case n over 2. 730 00:42:44,266 --> 00:42:46,271 Half as many because\nyou're going twice as fast. 731 00:42:46,271 --> 00:42:50,611 But the third algorithm, actually\nif you recall your logarithms 732 00:42:50,610 --> 00:42:52,320 looks a little something like this. 733 00:42:52,320 --> 00:42:54,750 There's a fundamentally\ndifferent relationship 734 00:42:54,751 --> 00:42:58,261 between the size of the problem and\n 735 00:42:58,260 --> 00:43:00,150 that technically is\nlog base 2 of n 736 00:43:00,150 --> 00:43:02,911 but it's really the\nshape that's different. 737 00:43:02,911 --> 00:43:06,331 The implication there is that if,\n 738 00:43:06,331 --> 00:43:08,911 two different towns here in\nMassachusetts, merge next year 739 00:43:08,911 --> 00:43:11,971 and there's just one phone\nbook that's twice as big 740 00:43:11,971 --> 00:43:14,670 no big deal for that\nthird and final algorithm. 741 00:43:15,181 --> 00:43:17,941 You just tear the problem\none more time in half 742 00:43:17,940 --> 00:43:21,360 taking one more bite,\nthat's it, not another 1,000 743 00:43:21,360 --> 00:43:23,760 bites just to get to the solution. 744 00:43:23,760 --> 00:43:26,320 Put another way, you\ncan walk way, way 745 00:43:26,320 --> 00:43:29,520 way out here to a much bigger\nphone book and ultimately 746 00:43:29,521 --> 00:43:32,551 that green line is barely\ngoing to have budged.
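The three relationships described above (n, n over 2, and log base 2 of n) can be compared with a short Python sketch. The function names are hypothetical, and the counts are worst-case step counts under the simplifying assumption that one step means looking at one page.

```python
import math


def steps_one_at_a_time(n: int) -> int:
    """Worst case for the first algorithm: look at all n pages."""
    return n


def steps_two_at_a_time(n: int) -> int:
    """Worst case for the second algorithm: about half as many looks."""
    return math.ceil(n / 2)


def steps_halving(n: int) -> int:
    """Worst case for the third algorithm: how many times n pages can be torn in half."""
    steps = 0
    while n > 1:
        n //= 2
        steps += 1
    return steps


for n in (1_000, 2_000):
    print(n, steps_one_at_a_time(n), steps_two_at_a_time(n), steps_halving(n))
# 1000 1000 500 9
# 2000 2000 1000 10
```

Doubling the phone book doubles the work for the first two algorithms but adds only a single step to the third, which is why its line barely budges.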
747 00:43:32,550 --> 00:43:35,880 So this then is just a way of\nnow formalizing and thinking 748 00:43:35,880 --> 00:43:42,400 about what the performance or\n 749 00:43:42,400 --> 00:43:46,070 Before we now make one more\n 750 00:43:46,070 --> 00:43:52,121 any questions then on this notion of\n 751 00:43:52,621 --> 00:43:56,193 AUDIENCE: How many phone\nbooks have you got? 752 00:43:56,193 --> 00:43:58,651 DAVID MALAN: (LAUGHING) A lot\nof phone books over the years 753 00:43:58,650 --> 00:44:02,040 and if you or your parents have any\n 754 00:44:02,041 --> 00:44:04,501 use them because they're hard to find. 755 00:44:13,007 --> 00:44:15,340 AUDIENCE: You could get Harry\nPotter as a guest speaker. 756 00:44:15,340 --> 00:44:16,473 DAVID MALAN: Sorry, say again. 757 00:44:16,474 --> 00:44:18,780 AUDIENCE: You could get Harry\nPotter as a guest speaker. 758 00:44:18,780 --> 00:44:19,800 DAVID MALAN: (LAUGHING) Oh, yeah. 759 00:44:20,300 --> 00:44:23,380 Then we'd have a little\nsomething more to use here. 760 00:44:23,380 --> 00:44:28,740 So now if we want to formalize\nfurther what it is we just did 761 00:44:28,740 --> 00:44:30,360 we can go ahead and introduce this. 762 00:44:30,360 --> 00:44:33,327 A form of code aka pseudocode. 763 00:44:33,327 --> 00:44:35,911 Pseudocode is not a specific\nlanguage, it's not like something 764 00:44:35,911 --> 00:44:38,851 we're about to start coding in, it's\n 765 00:44:38,851 --> 00:44:41,371 in English or any human\nlanguage succinctly 766 00:44:41,371 --> 00:44:45,331 correctly toward an end of getting\n 767 00:44:45,331 --> 00:44:47,701 So for instance, here\nmight be how we could 768 00:44:47,701 --> 00:44:51,331 formalize the code, the pseudocode\nfor that same algorithm. 769 00:44:51,331 --> 00:44:54,211 Step one was pick up the\nphone book, as I did. 770 00:44:54,210 --> 00:44:56,940 Step two might be open to\nthe middle of the phone book 771 00:44:56,940 --> 00:44:58,740 as you proposed that we do first. 
772 00:44:58,740 --> 00:45:01,920 Step three was probably to\nlook down at the pages, as I did. 773 00:45:01,920 --> 00:45:04,530 And step four gets a\nlittle more interesting 774 00:45:04,530 --> 00:45:07,770 because I had to quickly make a\n 775 00:45:07,771 --> 00:45:11,521 If person is on page, then\nI should probably just 776 00:45:11,521 --> 00:45:13,251 go ahead and call that person. 777 00:45:13,251 --> 00:45:15,751 But that probably wasn't the\ncase at least for John Harvard 778 00:45:17,851 --> 00:45:19,951 So there's this other\nquestion I should now 779 00:45:19,951 --> 00:45:22,590 ask: else, if the person\nis earlier in the book 780 00:45:22,590 --> 00:45:26,010 then I should tear the problem\nin half as I did but go left, so 781 00:45:26,010 --> 00:45:30,121 to speak, and then not just open to the\n 782 00:45:30,121 --> 00:45:33,300 but really just go back to\nstep three, repeat myself. 783 00:45:33,900 --> 00:45:37,650 Because I can just repeat what I\n 784 00:45:39,010 --> 00:45:41,753 But, if the person\nwas later in the book 785 00:45:41,753 --> 00:45:44,461 as might have happened with a\n 786 00:45:44,460 --> 00:45:47,085 then I should open to the middle\nof the right half of the book 787 00:45:47,085 --> 00:45:49,075 again go back to line\nthree, but again, I'm 788 00:45:49,076 --> 00:45:51,451 not going to get stuck doing\nsomething forever like this 789 00:45:51,451 --> 00:45:54,521 because I keep shrinking\nthe size of the problem. 790 00:45:54,521 --> 00:45:56,431 Lastly, the only\npossible scenario that's 791 00:45:56,431 --> 00:45:59,905 left, if John Harvard is not on\n 792 00:45:59,905 --> 00:46:02,280 and he's not to the right,\nwhat should our conclusion be? 793 00:46:03,322 --> 00:46:04,489 DAVID MALAN: He's not there. 794 00:46:05,391 --> 00:46:08,240 So we need to quit in some other form.
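The pseudocode walked through above corresponds to what is usually called binary search. Here is one way it might look in Python, as a sketch and not the course's own code: each page is simplified to a single name, and the function name `find_page` and the sample names other than Harvard are made up for illustration.

```python
def find_page(pages, name):
    """Return the index of `name` in alphabetically sorted `pages`, or None if absent."""
    low, high = 0, len(pages) - 1
    while low <= high:                 # keep going while some pages remain
        middle = (low + high) // 2     # open to the middle of what is left
        if pages[middle] == name:      # person is on the page
            return middle
        elif name < pages[middle]:     # person is earlier in the book: keep the left half
            high = middle - 1
        else:                          # person is later in the book: keep the right half
            low = middle + 1
    return None                        # nothing left to search: the person is not there


pages = ["Adams", "Baker", "Chen", "Harvard", "Malan", "Smith"]
print(find_page(pages, "Harvard"))  # 3
print(find_page(pages, "Potter"))   # None
```

The final `return None` plays the role of the last "quit" step in the pseudocode. The loop is guaranteed to end because the range between `low` and `high` shrinks on every pass.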
795 00:46:08,240 --> 00:46:11,751 Now as an aside, it's kind of deliberate\n 796 00:46:11,751 --> 00:46:16,021 at the end because this is what\n 797 00:46:16,021 --> 00:46:17,990 whether you're new at\nit or professional 798 00:46:17,990 --> 00:46:22,201 just not considering all possible\n 799 00:46:22,201 --> 00:46:25,701 that might not happen that often,\n 800 00:46:25,701 --> 00:46:28,550 in your own code,\npseudocode or otherwise 801 00:46:28,550 --> 00:46:31,101 this is when and why\nprograms might crash 802 00:46:31,101 --> 00:46:34,111 or you might say stupid little\n 803 00:46:34,110 --> 00:46:35,360 or your computer might reboot. 804 00:46:35,931 --> 00:46:38,780 It's doing something\nsort of unpredictable 805 00:46:38,780 --> 00:46:42,411 if a human, maybe myself,\ndidn't anticipate this. 806 00:46:42,411 --> 00:46:45,501 Like what does this program do if\n 807 00:46:45,501 --> 00:46:48,109 if I had omitted lines 12 and 13? 808 00:46:48,650 --> 00:46:50,650 Maybe it would behave\ndifferently on a Mac or PC 809 00:46:50,650 --> 00:46:53,330 because it's sort of undefined behavior. 810 00:46:53,331 --> 00:46:56,510 These are the kinds of omissions\nthat frankly you're invariably 811 00:46:56,510 --> 00:46:58,641 going to make, bugs\nyou're going to introduce 812 00:46:58,641 --> 00:47:02,751 mistakes you're going to make early\n 813 00:47:02,751 --> 00:47:06,111 But you'll get better at\nthinking about those corner cases 814 00:47:06,110 --> 00:47:09,030 and handling anything that can\n 815 00:47:09,030 --> 00:47:11,931 your code will be all the better for it. 816 00:47:11,931 --> 00:47:15,262 Now the problem ultimately\nwith learning how to program 817 00:47:15,262 --> 00:47:16,971 especially if you've\nnever had experience 818 00:47:16,971 --> 00:47:21,411 or even if you do but you\nlearned one language only 819 00:47:21,411 --> 00:47:25,130 is that they all look a little\ncryptic at first glance. 
820 00:47:25,130 --> 00:47:27,260 But they do share certain commonalities. 821 00:47:27,260 --> 00:47:30,141 In fact, we'll use this\npseudocode to define those first. 822 00:47:30,141 --> 00:47:32,360 Highlighted in yellow\nhere are what henceforth 823 00:47:32,360 --> 00:47:34,520 we're going to start calling functions. 824 00:47:34,521 --> 00:47:37,641 Lots of different programming\nlanguages exist, but most of them 825 00:47:37,641 --> 00:47:40,820 have what we might call\nfunctions, which are actions 826 00:47:40,820 --> 00:47:43,797 or verbs that solve\nsome smaller problem. 827 00:47:43,797 --> 00:47:46,130 That is to say, you might use\na whole bunch of functions 828 00:47:46,130 --> 00:47:50,240 to solve a bigger problem\nbecause each function tends to do 829 00:47:50,240 --> 00:47:52,851 something very specific or precise. 830 00:47:52,851 --> 00:47:57,141 These then in English might be\n 831 00:47:57,141 --> 00:47:59,360 code, to these things called functions. 832 00:47:59,360 --> 00:48:02,931 Highlighted in yellow now are\nwhat we might call conditionals. 833 00:48:02,931 --> 00:48:05,751 Conditionals are things\nthat you do conditionally 834 00:48:05,751 --> 00:48:07,443 based on the answer to some question. 835 00:48:07,443 --> 00:48:09,651 You can think of them kind\nof like forks in the road. 836 00:48:09,650 --> 00:48:12,350 Do you go left or go right\nor some other direction 837 00:48:12,351 --> 00:48:14,809 based on the answer to some question? 838 00:48:14,809 --> 00:48:16,101 Well, what are those questions? 839 00:48:16,101 --> 00:48:20,181 Highlighted now in yellow are what we\n 840 00:48:20,181 --> 00:48:25,070 after a mathematician, last name Boole,\n 841 00:48:25,070 --> 00:48:31,130 Or, if you prefer, true or false answers\n 842 00:48:31,130 --> 00:48:34,310 We just need to distinguish\none scenario from another. 
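[Editor's sketch] To make the three terms above concrete, here is one possible illustration in Python. The names `is_earlier` and `which_way`, and the returned strings, are invented for this example and do not come from the lecture.

```python
def is_earlier(target, name_on_page):
    """Boolean expression: a question whose answer is True or False."""
    return target < name_on_page


def which_way(target, name_on_page):
    """Conditional: a fork in the road, chosen by answering Boolean questions."""
    if target == name_on_page:
        return "call"
    elif is_earlier(target, name_on_page):
        return "go left"
    else:
        return "go right"


print(which_way("Alice", "David"))
```

Both `is_earlier` and `which_way` are functions in the sense used above: each does one small, specific job, and the bigger problem is solved by combining them.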
843 00:48:34,311 --> 00:48:37,101 The last thing manifest\nin this pseudocode 844 00:48:37,101 --> 00:48:39,740 is what I might highlight\nnow and call loops. 845 00:48:39,740 --> 00:48:43,521 Some kind of cycle, some kind of\n 846 00:48:43,521 --> 00:48:48,861 again and again so that I don't\n 847 00:48:48,860 --> 00:48:53,030 a 1,000-page phone book, I can get\n 848 00:48:53,030 --> 00:48:58,161 of repeat myself inherently in order to\n 849 00:48:59,221 --> 00:49:01,581 So this then is what we\nmight call pseudocode 850 00:49:01,581 --> 00:49:05,151 and indeed there are\nother characteristics 851 00:49:05,150 --> 00:49:08,540 of programs that we'll touch on before\n 852 00:49:08,541 --> 00:49:13,740 values, variables, and more, but\n 853 00:49:13,740 --> 00:49:16,371 including some we will very\ndeliberately use in this class 854 00:49:16,371 --> 00:49:19,760 and that everyone in the real\nworld these days still uses 855 00:49:19,760 --> 00:49:22,201 its programs tend to look like this. 856 00:49:22,201 --> 00:49:24,860 This for instance, is a distillation\nof that very first program 857 00:49:24,860 --> 00:49:29,810 I wrote in 1996 in CS50 itself just\n 858 00:49:29,811 --> 00:49:34,731 In fact, this version here just tries\n 859 00:49:34,731 --> 00:49:37,731 Which is, I dare say, the most\ncanonical first thing that most 860 00:49:37,731 --> 00:49:41,601 any programmer ever gets a\ncomputer to say just because 861 00:49:42,900 --> 00:49:45,920 I mean, there's a hash symbol,\n 862 00:49:45,920 --> 00:49:49,760 words like int, curly braces, quotes,\n 863 00:49:50,420 --> 00:49:54,020 I mean there's more overhead\nand more syntax and clutter 864 00:49:54,021 --> 00:49:55,941 than there is an actual idea. 
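[Editor's sketch] The program on the slide is not reproduced in this transcript, so it is not reconstructed here. For contrast only, the same single idea (getting the computer to display a greeting) might look like this in Python; the function name `main` is just a common convention, assumed for the example.

```python
def main():
    # The one real idea in the program: display a message on the screen.
    print("Hello, world.")


main()
```

Even here, only the `print` line carries the idea; everything else is scaffolding around it.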
865 00:49:55,940 --> 00:50:00,080 Now that's not to say that you won't\n 866 00:50:00,081 --> 00:50:03,441 because honestly there's not that many\n 867 00:50:03,440 --> 00:50:07,701 have typically a much smaller vocabulary\n 868 00:50:07,701 --> 00:50:10,581 but at first it might\nindeed look quite cryptic. 869 00:50:10,581 --> 00:50:14,361 But you can perhaps infer I have no\n 870 00:50:14,360 --> 00:50:18,170 but "Hello, world." is\npresumably quote unquote what 871 00:50:18,170 --> 00:50:19,860 will be printed on the screen. 872 00:50:19,860 --> 00:50:22,880 But what we'll do today,\nafter a short break 873 00:50:22,880 --> 00:50:24,980 and set the stage for\nnext week is introduce 874 00:50:24,981 --> 00:50:27,336 these exact same ideas in\njust a bit using Scratch 875 00:50:27,335 --> 00:50:29,210 something that you\nyourselves might have used 876 00:50:29,210 --> 00:50:32,630 when you're quite younger but\n 877 00:50:34,041 --> 00:50:38,421 The upside of what we'll soon do using\n 878 00:50:38,420 --> 00:50:41,960 language from our friends down the\n 879 00:50:41,960 --> 00:50:46,490 start to drag and drop things that\n 880 00:50:46,490 --> 00:50:48,471 together if it makes\nlogical sense to do so 881 00:50:48,471 --> 00:50:51,710 but without the distraction\nof hashes, parentheses 882 00:50:51,710 --> 00:50:54,410 curly braces, angle brackets,\nsemicolons, and things 883 00:50:54,411 --> 00:50:56,311 that are quite beside the point. 884 00:50:56,311 --> 00:50:58,800 But for now, let's go ahead\nand take a 10 minute break here 885 00:50:58,800 --> 00:51:01,101 and when we resume, we\nwill start programming. 
886 00:51:01,101 --> 00:51:04,561 So this on the screen\nis a language called 887 00:51:04,561 --> 00:51:07,921 C, something that we'll dive\ninto next week and thankfully 888 00:51:07,920 --> 00:51:10,980 this now on the screen is\nanother language called Python 889 00:51:10,981 --> 00:51:13,921 that we'll also take a look at\nin a few weeks before long along 890 00:51:13,920 --> 00:51:15,690 with other languages along the way. 891 00:51:15,690 --> 00:51:19,640 Today though, and for this first\nweek, week zero, so to speak 892 00:51:19,641 --> 00:51:21,391 we use Scratch because\nagain it will allow 893 00:51:21,391 --> 00:51:23,971 us to explore some of those\nprogramming fundamentals 894 00:51:23,971 --> 00:51:28,471 that will be in C and in Python and in\n 895 00:51:28,471 --> 00:51:32,521 but in a way where we don't have to\n 896 00:51:32,521 --> 00:51:35,351 So the world of Scratch looks like this. 897 00:51:35,351 --> 00:51:38,070 It's a web-based or downloadable\nprogramming environment 898 00:51:38,070 --> 00:51:40,920 that has this layout here\nby default. On the left here 899 00:51:40,920 --> 00:51:45,000 we'll soon see is a palette of puzzle\n 900 00:51:45,001 --> 00:51:47,521 represent all of those\nideas we just discussed. 901 00:51:47,521 --> 00:51:50,071 And by dragging and\ndropping these puzzle pieces 902 00:51:50,070 --> 00:51:54,181 or blocks over this big area\nand connecting them together 903 00:51:54,181 --> 00:51:56,251 if it makes logical\nsense to do so, we'll 904 00:51:56,251 --> 00:51:58,531 start programming in this environment. 
905 00:51:58,530 --> 00:52:01,561 The environment allows you to have\n 906 00:52:01,561 --> 00:52:04,111 Multiple characters, things\nlike a cat or anything 907 00:52:04,110 --> 00:52:08,310 else, and those sprites exist\nin this rectangular world 908 00:52:08,311 --> 00:52:11,521 up here that you can full screen to\n 909 00:52:11,521 --> 00:52:16,201 is Scratch, who can move up, down, left,\n 910 00:52:16,201 --> 00:52:19,291 Within Scratch's\nworld you can think of it 911 00:52:19,291 --> 00:52:23,251 as perhaps a familiar\ncoordinate system with Xs and Ys 912 00:52:23,251 --> 00:52:27,001 which is helpful only when it comes\n 913 00:52:27,001 --> 00:52:32,281 Right now Scratch is at the default,\n 914 00:52:32,280 --> 00:52:36,061 If you were to move the cat way\n 915 00:52:37,891 --> 00:52:40,651 If you move the cat all the way\n 916 00:52:40,650 --> 00:52:42,940 but y would now be negative 180. 917 00:52:42,940 --> 00:52:47,460 And if you went left, x would become\n 918 00:52:47,460 --> 00:52:51,690 or to the right x would be\n240 and y would stay zero. 919 00:52:51,690 --> 00:52:55,590 So those numbers generally don't\n 920 00:52:55,590 --> 00:52:58,170 move relatively in this\nworld up, down, left, right 921 00:52:58,170 --> 00:53:01,230 but when it comes time\nto precisely position 922 00:53:01,231 --> 00:53:03,601 some of these sprites\nor other imagery, it'll 923 00:53:03,601 --> 00:53:07,050 be helpful just to have that mental\n 924 00:53:07,050 --> 00:53:10,150 Well let's go ahead and make perhaps\n 925 00:53:10,150 --> 00:53:13,050 I'm going to switch over to the\nsame programming environment 926 00:53:13,050 --> 00:53:15,971 now for a tour of the left hand side. 
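[Editor's sketch] The coordinate system described above can be modeled in a few lines of Python. Some of the values are cut off in this transcript, so the bounds below rely on an assumption: Scratch's standard stage, 480 units wide and 360 tall with (0, 0) at the center, which agrees with the 240 and negative 180 that are quoted. The helper `clamp_to_stage` is hypothetical and is not a Scratch block.

```python
# Assumed stage bounds: 480 wide, 360 tall, origin at the center.
X_MIN, X_MAX = -240, 240
Y_MIN, Y_MAX = -180, 180


def clamp_to_stage(x, y):
    """Return (x, y), pulled back to the nearest edge if it lies off stage."""
    clamped_x = max(X_MIN, min(X_MAX, x))
    clamped_y = max(Y_MIN, min(Y_MAX, y))
    return clamped_x, clamped_y


print(clamp_to_stage(0, 0))       # the default position, dead center
print(clamp_to_stage(500, -999))  # far off stage, pulled back to a corner
```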
927 00:53:15,971 --> 00:53:20,911 So by default selected here is\nthe category in blue, Motion 928 00:53:20,911 --> 00:53:24,451 which has a whole bunch of puzzle\n 929 00:53:24,451 --> 00:53:27,001 And whereas Scratch as\na graphical language 930 00:53:27,001 --> 00:53:30,688 categorizes things by the type\nof things that these pieces do 931 00:53:30,688 --> 00:53:32,521 we'll see that throughout\nthis whole palette 932 00:53:32,521 --> 00:53:35,461 we'll have functions and\nvariables and conditionals 933 00:53:35,460 --> 00:53:38,650 and Boolean expressions and more\n 934 00:53:38,650 --> 00:53:42,181 So for instance, moving 10 steps\nor turning one way or the other 935 00:53:42,181 --> 00:53:45,691 would be functions categorized\nhere as things like motion. 936 00:53:45,690 --> 00:53:49,257 Under looks in purple, you\nmight have speech bubbles 937 00:53:49,257 --> 00:53:51,090 that you can create by\ndragging and dropping 938 00:53:51,090 --> 00:53:54,360 these that might say "hello" or\n 939 00:53:54,360 --> 00:53:58,320 Or you could switch costumes, change\n 940 00:54:01,320 --> 00:54:05,010 You can play sounds like "meow" or\n 941 00:54:06,001 --> 00:54:09,271 Then there's these things Scratch calls\n 942 00:54:09,271 --> 00:54:11,191 is the first, when green flag clicked. 943 00:54:11,190 --> 00:54:14,221 Because if we look over to the\nright of Scratch's world here 944 00:54:14,221 --> 00:54:17,670 this rectangular region has\nthis green flag and red stop 945 00:54:17,670 --> 00:54:20,880 sign up above, one of which is\nfor Play one of which is for Stop 946 00:54:20,880 --> 00:54:24,510 and so that's going to allow us to\n 947 00:54:24,510 --> 00:54:27,840 when that green flag\nis initially clicked. 
948 00:54:27,840 --> 00:54:31,860 But you can listen for other types of\n 949 00:54:31,860 --> 00:54:35,911 or something else, when this sprite\n 950 00:54:35,911 --> 00:54:39,570 Here you already see like a\nprogrammer's incarnation of things 951 00:54:39,570 --> 00:54:42,451 you and I take for granted like\nevery day now on our phones. 952 00:54:42,451 --> 00:54:46,391 Any time you tap an icon or drag your\n 953 00:54:46,391 --> 00:54:48,550 These are what a programmer\nwould call events 954 00:54:48,550 --> 00:54:51,451 things that happen and\nare often triggered by us 955 00:54:51,451 --> 00:54:55,471 humans and things that a program\nbe it in Scratch or Python 956 00:54:55,471 --> 00:54:59,280 or C or anything else can\nlisten for and respond to. 957 00:54:59,280 --> 00:55:01,980 Indeed, that's why when you tap\nthe phone icon on your phone 958 00:55:01,981 --> 00:55:04,441 the phone application\nstarts up because someone 959 00:55:04,440 --> 00:55:08,640 wrote software that's listening for a\n 960 00:55:08,641 --> 00:55:10,711 So Scratch has these same things, too. 961 00:55:10,710 --> 00:55:13,590 Under Control in orange,\nyou can see that we 962 00:55:13,590 --> 00:55:15,721 can wait for one second\nor repeat something 963 00:55:15,721 --> 00:55:17,670 some number of times,\n10 by default, but we 964 00:55:17,670 --> 00:55:20,800 can change anything in these\nwhite circles to anything else. 965 00:55:20,800 --> 00:55:22,710 There's another puzzle\npiece here forever 966 00:55:22,710 --> 00:55:25,590 which implies some kind of loop where\n 967 00:55:25,590 --> 00:55:27,060 Even though it seems a\nlittle tight, there's 968 00:55:27,061 --> 00:55:29,131 not much room to fit\nsomething there, Scratch 969 00:55:29,130 --> 00:55:31,005 is going to have these\nthings grow and shrink 970 00:55:31,005 --> 00:55:33,993 however we want to fill\nsimilarly shaped pieces. 971 00:55:33,994 --> 00:55:35,161 Here are those conditionals. 
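[Editor's sketch] The idea of a program listening for events and responding to them can be illustrated with a small registry in Python. Everything here is hypothetical: `when`, `trigger`, and the event names are invented for the example and say nothing about how Scratch or a phone is actually implemented.

```python
handlers = {}  # maps an event name to the functions that respond to it


def when(event_name, handler):
    """Register handler to run whenever event_name happens."""
    handlers.setdefault(event_name, []).append(handler)


def trigger(event_name):
    """Simulate the event: run every handler that is listening for it."""
    return [handler() for handler in handlers.get(event_name, [])]


when("green flag clicked", lambda: "Hello, world")
when("phone icon tapped", lambda: "phone application starts")

print(trigger("green flag clicked"))
print(trigger("mouse moved"))  # nobody is listening, so nothing happens
```

The point of the design is that the code responding to an event is written ahead of time and simply waits; nothing runs until the event actually occurs.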
972 00:55:35,161 --> 00:55:40,061 If something is true or false,\nthen do this next thing. 973 00:55:40,061 --> 00:55:42,841 And that's how we can put in\nthis little trapezoid-like shape. 974 00:55:42,840 --> 00:55:46,770 Some form of Boolean expression, a\n 975 00:55:46,771 --> 00:55:50,131 or one/zero answer and decide\nwhether to do something or not. 976 00:55:50,130 --> 00:55:52,440 You can combine these things, too. 977 00:55:52,440 --> 00:55:55,905 If something is true, do this,\nelse do this other thing. 978 00:55:55,905 --> 00:55:57,780 And you can even tuck\none inside of the other 979 00:55:57,780 --> 00:56:01,320 if you want to ask three\nor four or more questions. 980 00:56:01,320 --> 00:56:03,010 Sensing, too, is going to be a thing. 981 00:56:03,010 --> 00:56:07,871 You can ask questions aka Boolean\n 982 00:56:07,871 --> 00:56:10,161 the mouse pointer, the\narrow on the screen? 983 00:56:10,161 --> 00:56:12,800 So that you can start to\ninteract with these programs. 984 00:56:12,800 --> 00:56:15,298 What is the distance between\na sprite and a mouse pointer? 985 00:56:15,298 --> 00:56:17,590 You can do simple calculations\njust to figure out maybe 986 00:56:17,590 --> 00:56:20,141 if the enemy is getting\nclose to the cat. 987 00:56:20,141 --> 00:56:23,651 Under Operator some lower level\n 988 00:56:23,650 --> 00:56:25,850 to pick random numbers,\nwhich for a game is great 989 00:56:25,851 --> 00:56:27,851 because then you can kind\nof vary the difficulty 990 00:56:27,851 --> 00:56:30,309 or what's happening in a game\nwithout the same game playing 991 00:56:33,550 --> 00:56:37,520 Something and something must be true\n 992 00:56:38,021 --> 00:56:40,240 Or we can even join two words together. 
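[Editor's sketch] The sensing and conditional blocks described above, including one conditional tucked inside another, might translate to something like the following Python. The function names, the tuple representation of positions, and the `danger_radius` default of 50 are all assumptions made for illustration.

```python
import math


def distance(ax, ay, bx, by):
    """Straight-line distance between two points on the stage."""
    return math.hypot(bx - ax, by - ay)


def react(cat, enemy, danger_radius=50):
    """Nested conditionals: each test is a Boolean expression."""
    gap = distance(cat[0], cat[1], enemy[0], enemy[1])
    if gap == 0:
        return "caught"
    else:
        if gap < danger_radius:   # a second question inside the first
            return "run"
        else:
            return "stay"


print(react((0, 0), (3, 4)))
print(react((0, 0), (60, 80)))
```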
993 00:56:40,240 --> 00:56:43,181 Says apple and banana by default,\n 994 00:56:43,181 --> 00:56:46,990 whatever you want there to\ncombine multiple words into full 995 00:56:48,581 --> 00:56:52,361 Then lastly down here, there's in\n 996 00:56:52,360 --> 00:56:54,911 In math we've obviously\ngot x and y and whatnot. 997 00:56:54,911 --> 00:56:56,921 In programming we'll\nhave the same ability 998 00:56:56,920 --> 00:57:03,010 to store in these named symbols,\n 999 00:57:03,010 --> 00:57:06,701 Numbers or letters or words or\ncolors or anything, ultimately. 1000 00:57:06,701 --> 00:57:09,971 But in programming you'll see that\n 1001 00:57:09,971 --> 00:57:13,391 use simple letters like x\nand y and z, but to actually 1002 00:57:13,391 --> 00:57:20,291 give variables full singular or plural\n 1003 00:57:20,291 --> 00:57:24,131 Then lastly, if this isn't\nenough color blocks for you 1004 00:57:24,130 --> 00:57:25,960 you can create your own blocks. 1005 00:57:25,960 --> 00:57:29,050 Indeed, this is going to be a\n 1006 00:57:29,050 --> 00:57:32,920 and with the first problem set whereby\n 1007 00:57:32,920 --> 00:57:37,660 pieces and you realize, oh, would have\n 1008 00:57:37,661 --> 00:57:40,990 have just been replaced by one\nhad MIT thought to give me that 1009 00:57:40,990 --> 00:57:44,291 one puzzle piece, you yourself\ncan make your own blocks 1010 00:57:44,291 --> 00:57:47,111 by connecting these all together,\ngiving them a name, and boom 1011 00:57:47,110 --> 00:57:49,250 a new puzzle piece will exist. 
1012 00:57:49,251 --> 00:57:51,581 So let's do the simplest,\nmost canonical programs 1013 00:57:51,581 --> 00:57:53,530 here, starting up with\ncontrol, and I'm going 1014 00:57:53,530 --> 00:57:57,880 to click and drag and drop this\n 1015 00:57:57,880 --> 00:58:01,107 Then I'm going to grab one\nmore, for instance under Looks 1016 00:58:01,108 --> 00:58:03,191 and under Looks I'm going\nto go ahead and just say 1017 00:58:03,190 --> 00:58:07,900 something like initially not\njust Hello but the more canonical 1018 00:58:09,550 --> 00:58:12,250 Now you might guess that in\nthis programming environment 1019 00:58:12,251 --> 00:58:15,611 I can go over here now and\nclick the green flag and voila 1020 00:58:17,090 --> 00:58:19,005 So that's my first\nprogram and obviously much 1021 00:58:19,005 --> 00:58:21,880 more user friendly than typing out\n 1022 00:58:21,880 --> 00:58:25,040 saw on the screen that you,\ntoo, will type out next week. 1023 00:58:25,041 --> 00:58:28,280 But for now, we'll just focus on\n 1024 00:58:28,280 --> 00:58:29,891 So what is it that just happened? 1025 00:58:29,891 --> 00:58:32,391 This purple block here is\nSay, that's the function 1026 00:58:32,391 --> 00:58:37,090 and it seems to take some form of input\n 1027 00:58:37,996 --> 00:58:40,121 Well this actually fits\nthe paradigm that we looked 1028 00:58:40,121 --> 00:58:42,445 at earlier of just inputs and outputs. 1029 00:58:42,445 --> 00:58:45,340 So if I may, if you consider\nwhat this puzzle piece is doing 1030 00:58:47,050 --> 00:58:51,010 The input in this case is going\n 1031 00:58:51,010 --> 00:58:55,346 The algorithm is going to be implemented\n 1032 00:58:55,346 --> 00:58:57,971 and the output of that is going\nto be some kind of side effect 1033 00:58:57,971 --> 00:59:01,271 like the cat and the speech\nbubble are saying Hello, world. 
1034 00:59:01,271 --> 00:59:03,760 So already even that\nsimple drag and drop 1035 00:59:03,760 --> 00:59:07,340 mimics exactly this relatively\nsimple mental model. 1036 00:59:07,340 --> 00:59:08,750 So let's take things further. 1037 00:59:08,751 --> 00:59:11,923 Let's go ahead now and make the\n 1038 00:59:11,922 --> 00:59:14,380 that it says something like\nHello, David, or Hello, Carter 1039 00:59:14,380 --> 00:59:16,600 or Hello to you specifically. 1040 00:59:16,601 --> 00:59:18,556 And for this, I'm going\nto go under Sensing. 1041 00:59:18,556 --> 00:59:21,431 And you might have to poke around\n 1042 00:59:21,431 --> 00:59:24,472 around, but I've done this a few times\n 1043 00:59:26,860 --> 00:59:28,752 Ask what's your name,\nbut that's in white 1044 00:59:28,753 --> 00:59:30,461 so we can change the\nquestion to anything 1045 00:59:30,460 --> 00:59:34,930 we want, and it's going to wait for\n 1046 00:59:34,931 --> 00:59:37,510 This function called Ask\nis a little different 1047 00:59:37,510 --> 00:59:41,320 from the Say block, which just had\n 1048 00:59:42,460 --> 00:59:47,080 The ask function is even more powerful\n 1049 00:59:47,831 --> 00:59:50,831 This function is going\nto hand you back what 1050 00:59:50,831 --> 00:59:55,091 they typed in in the form of\nwhat's called a return value, which 1051 00:59:55,090 --> 00:59:57,940 is stored ultimately and by\ndefault in this thing called Answer. 1052 00:59:57,940 --> 01:00:00,612 This little blue oval here\ncalled Answer is again 1053 01:00:00,612 --> 01:00:02,320 one of these variables\nthat in math would 1054 01:00:02,320 --> 01:00:05,828 be called just x or y but in\n 1055 01:00:05,829 --> 01:00:07,371 So I'm going to go ahead and do this. 
1056 01:00:07,371 --> 01:00:09,204 Let me go ahead and\ndrag and drop this block 1057 01:00:09,204 --> 01:00:11,631 and I want to ask the question\nbefore saying anything 1058 01:00:11,630 --> 01:00:13,630 but you'll notice that\nScratch is smart and it's 1059 01:00:13,630 --> 01:00:15,641 going to realize I want to\ninsert something in between 1060 01:00:15,641 --> 01:00:17,599 and it's just going to\nmove things up and down. 1061 01:00:17,599 --> 01:00:20,530 I'm going to let go and ask the\n 1062 01:00:20,530 --> 01:00:23,770 And now if I want to go ahead\nand say hello, David or Carter 1063 01:00:23,771 --> 01:00:26,141 let's just do Hello\ncomma, because I obviously 1064 01:00:26,141 --> 01:00:28,820 don't know when I'm writing the\nprogram who's going to use it. 1065 01:00:28,820 --> 01:00:35,141 So let me now grab another looks block\n 1066 01:00:35,141 --> 01:00:39,280 let me go back to Sensing and now\n 1067 01:00:39,280 --> 01:00:42,610 by this other puzzle piece, and\n 1068 01:00:42,610 --> 01:00:45,460 Notice it's the same shape, even\n 1069 01:00:45,460 --> 01:00:47,650 Things will grow or shrink as needed. 1070 01:00:47,650 --> 01:00:49,721 All right, so let's now zoom out. 1071 01:00:49,721 --> 01:00:52,630 Let me go and stop the old version\n 1072 01:00:53,291 --> 01:00:55,751 Let me hit the green\nflag and what's my name? 1073 01:00:59,338 --> 01:01:01,880 All right, maybe I just wasn't\npaying close enough attention. 1074 01:01:03,141 --> 01:01:06,360 Green flag, D-A-V-I-D, Enter. 1075 01:01:09,280 --> 01:01:13,300 What's the bug or\nmistake might you think? 1076 01:01:14,150 --> 01:01:17,370 AUDIENCE: Do you need to somehow add\n 1077 01:01:17,371 --> 01:01:20,760 DAVID MALAN: Yeah, we kind of want\n 1078 01:01:20,760 --> 01:01:23,940 And it's technically a bug because\n 1079 01:01:23,940 --> 01:01:26,910 It's just saying David\nafter I asked for my name. 
1080 01:01:26,911 --> 01:01:29,971 I'd like it to say\nmaybe Hello then David 1081 01:01:29,971 --> 01:01:32,460 but it's just blowing past\nthe Hello and printing David. 1082 01:01:32,460 --> 01:01:34,780 But let's put our finger\non why this is happening. 1083 01:01:34,780 --> 01:01:38,190 You're right for the solution, but\n 1084 01:01:39,228 --> 01:01:42,706 AUDIENCE: So it says hello,\nbut it gets to that last step 1085 01:01:42,706 --> 01:01:43,925 so quickly you can't see it. 1086 01:01:44,800 --> 01:01:47,440 I mean, computers are\nreally darn fast these days. 1087 01:01:47,440 --> 01:01:50,350 It is saying Hello, all of us\nare just too slow in this room 1088 01:01:50,351 --> 01:01:54,740 to even see it because it's then saying\n 1089 01:01:54,740 --> 01:01:57,443 So there's a couple of solutions\nhere, and yours is spot on 1090 01:01:57,443 --> 01:01:59,651 but just to poke around,\nyou'll see the first example 1091 01:01:59,650 --> 01:02:03,130 of how many ways in programming be\n 1092 01:02:03,130 --> 01:02:05,440 else, that there are going\nto be to solve problems? 1093 01:02:05,440 --> 01:02:07,420 We'll teach you over the\ncourse of these weeks 1094 01:02:07,420 --> 01:02:10,660 sometimes some ways are\nbetter relatively than others 1095 01:02:10,661 --> 01:02:13,811 but rarely is there a\nbest way necessarily 1096 01:02:13,811 --> 01:02:15,728 because again reasonable\npeople will disagree. 1097 01:02:15,728 --> 01:02:17,936 And what we'll try to teach\nyou over the coming weeks 1098 01:02:17,936 --> 01:02:20,111 is how to kind of think\nthrough those nuances. 
1099 01:02:20,110 --> 01:02:22,257 And it's not going to be\nobvious at first glance 1100 01:02:22,257 --> 01:02:24,340 but the more programs you\nwrite, the more feedback 1101 01:02:24,340 --> 01:02:26,951 you get, the more bugs\nthat you introduce 1102 01:02:26,951 --> 01:02:30,860 the more you'll get your footing with\n 1103 01:02:30,860 --> 01:02:33,130 So let me try this in a couple of ways. 1104 01:02:33,130 --> 01:02:35,620 Up here would be one\nsolution to the problem. 1105 01:02:35,621 --> 01:02:40,286 MIT anticipated this kind of issue,\n 1106 01:02:40,286 --> 01:02:42,161 and I could just use a\npuzzle piece that says 1107 01:02:42,161 --> 01:02:44,771 say the following for\ntwo seconds or one second 1108 01:02:44,771 --> 01:02:47,381 or whatever, then do the\nsame with the next word 1109 01:02:47,380 --> 01:02:50,110 and it might be kind\nof a bit of a pause 1110 01:02:50,110 --> 01:02:54,610 Hello, one second, two seconds, David,\n 1111 01:02:54,610 --> 01:02:56,760 it would look a little\nmore grammatically correct. 1112 01:02:56,760 --> 01:02:59,260 But I can do it a little more\nelegantly, as you've proposed. 1113 01:02:59,260 --> 01:03:01,385 Let me go ahead and throw\naway one of these blocks 1114 01:03:01,385 --> 01:03:04,181 and you can just drag and let\ngo and it'll delete itself. 1115 01:03:04,181 --> 01:03:10,211 Let me go down to Operators because\n 1116 01:03:10,210 --> 01:03:13,721 So even if you're not sure what goes\n 1117 01:03:17,300 --> 01:03:20,170 Let me go ahead and\nsay hello comma space. 1118 01:03:20,170 --> 01:03:22,330 Now it could just say by\ndefault Hello, banana 1119 01:03:22,331 --> 01:03:27,701 but let me go back to\nSensing, Drag answer 1120 01:03:27,701 --> 01:03:29,501 and that's going to drag and drop there. 
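[Editor's sketch] The bug discussed above, where Hello is said but replaced by the name too quickly to be seen, can be modeled with a speech bubble that holds only its latest text. The class `SpeechBubble` and the hard-coded `answer` are hypothetical stand-ins so the example runs without keyboard input.

```python
class SpeechBubble:
    """Holds only the most recent text, like the cat's speech bubble."""

    def __init__(self):
        self.text = ""

    def say(self, message):
        self.text = message  # replaces whatever was shown before


answer = "David"  # stands in for what the user typed

buggy = SpeechBubble()
buggy.say("Hello,")
buggy.say(answer)              # immediately overwrites "Hello,"
print(buggy.text)

fixed = SpeechBubble()
fixed.say("Hello, " + answer)  # one call with the two words joined first
print(fixed.text)
```

Saying each word for a second or two would also work, as noted above; joining first just avoids the pause.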
1121 01:03:29,501 --> 01:03:34,161 So now notice we're sort of stacking\n 1122 01:03:34,161 --> 01:03:38,021 so that the output of one becomes the\n 1123 01:03:38,021 --> 01:03:42,024 Let me go ahead and zoom\nout, hit Stop, and hit Play. 1124 01:03:42,023 --> 01:03:43,190 All right, what's your name? 1125 01:03:45,550 --> 01:03:48,770 Now it's presumably\nas we first intended. 1126 01:03:57,811 --> 01:04:02,728 So consider that even with\nthis additional example 1127 01:04:02,728 --> 01:04:05,811 it still fits the same mental model,\n 1128 01:04:05,811 --> 01:04:08,961 Here's that new function\nAsk something and wait. 1129 01:04:08,960 --> 01:04:12,471 And notice that in this case too\n 1130 01:04:12,471 --> 01:04:15,621 henceforth as an argument\nor a parameter, programming 1131 01:04:15,621 --> 01:04:18,681 speak for just an input in\nthe context of a function. 1132 01:04:18,681 --> 01:04:21,831 If we use our drawing as before\nto represent this thing here 1133 01:04:21,831 --> 01:04:26,061 we'll see that the input now is going\n 1134 01:04:26,061 --> 01:04:30,081 The algorithm is going to be implemented\n 1135 01:04:30,081 --> 01:04:33,381 the function called Ask, and the\noutput of that thing this time 1136 01:04:33,380 --> 01:04:36,110 is not going to be the\ncat saying anything yet 1137 01:04:36,110 --> 01:04:39,721 but rather it's going\nto be the actual answer. 1138 01:04:39,721 --> 01:04:43,130 So instead of the visual side effect\n 1139 01:04:43,130 --> 01:04:45,620 now nothing visible is happening yet. 1140 01:04:45,621 --> 01:04:49,671 Thanks to this function it's sort of\n 1141 01:04:49,670 --> 01:04:55,701 with whatever I typed in written on it\n 1142 01:04:57,951 --> 01:05:00,050 Now what did I then do with that value? 1143 01:05:00,050 --> 01:05:04,820 Well consider that with\nthe subsequent function 1144 01:05:04,820 --> 01:05:08,280 we had this Say block,\ntoo, combined with a join. 
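[Editor's sketch] The stacking described above, where the output of one function becomes the input to another, might be written like this in Python. The functions `ask`, `join`, and `say` only imitate the Scratch blocks of the same names, and the `typed` parameter is an assumption that replaces real keyboard input so the sketch runs unattended.

```python
def ask(question, typed="David"):
    """Stand-in for Ask: hands back what the user typed as a return value."""
    return typed


def join(first, second):
    """Takes two arguments and returns them combined into one string."""
    return first + second


def say(message):
    """Side effect: displays the message. It has no useful return value."""
    print(message)


answer = ask("What's your name?")
say(join("Hello, ", answer))  # join's output becomes say's input
```

Note the two kinds of result: `ask` and `join` return values that other code can use, while `say` only produces a visible side effect.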
1145 01:05:08,280 --> 01:05:11,780 So we have this variable\ncalled Answer, we're joining it 1146 01:05:11,780 --> 01:05:14,202 with that first argument, Hello. 1147 01:05:14,202 --> 01:05:16,161 So already we see that\nsome functions like Join 1148 01:05:16,161 --> 01:05:20,001 can take not one but two arguments,\nor inputs, and that's fine. 1149 01:05:20,001 --> 01:05:24,891 The output of Join is presumably going\n 1150 01:05:24,891 --> 01:05:27,171 or whatever the human typed in. 1151 01:05:27,170 --> 01:05:31,580 That output notice is essentially\n 1152 01:05:31,581 --> 01:05:33,951 Say, just because we've\nkind of stacked things 1153 01:05:33,951 --> 01:05:35,851 or nested them on top of one another. 1154 01:05:35,851 --> 01:05:40,280 But methodically, it's\nreally the same idea. 1155 01:05:40,280 --> 01:05:44,900 The inputs now are two things,\nHello comma and the return value 1156 01:05:44,900 --> 01:05:47,300 from the previous Ask function. 1157 01:05:47,300 --> 01:05:51,230 The function now is going to be Join,\n 1158 01:05:51,231 --> 01:05:53,061 But that Hello, David\noutput is now going 1159 01:05:53,061 --> 01:05:57,501 to become the input to another function,\n 1160 01:05:57,501 --> 01:06:01,611 and that's then going to have the side\n 1161 01:06:02,670 --> 01:06:06,410 So again as sort of sophisticated\n 1162 01:06:06,411 --> 01:06:09,231 are going to get, they really do\n 1163 01:06:09,231 --> 01:06:12,741 of inputs and outputs and you just have\n 1164 01:06:12,740 --> 01:06:17,030 and to know what kinds of puzzle\n 1165 01:06:17,030 --> 01:06:19,650 But you can ultimately really\nkind of spice these things up. 1166 01:06:19,650 --> 01:06:21,320 Let me go back to my\nprogram here that just is 1167 01:06:21,320 --> 01:06:22,911 using the speech bubble at the moment. 1168 01:06:22,911 --> 01:06:26,090 Scratch's inside has some pretty\n 1169 01:06:26,090 --> 01:06:28,971 I click the Extensions button\nin the bottom left corner. 
1170 01:06:28,971 --> 01:06:33,248 And let me go ahead and choose\nthe Text to Speech extension. 1171 01:06:33,248 --> 01:06:36,081 This is using a Cloud service, so\n 1172 01:06:36,081 --> 01:06:39,141 it can actually talk to the\nCloud or a third party service 1173 01:06:39,141 --> 01:06:42,411 and this one is going to give me a\n 1174 01:06:42,411 --> 01:06:45,171 the ability to speak\nsomething from my speakers 1175 01:06:45,170 --> 01:06:47,090 instead of just saying it textually. 1176 01:06:47,090 --> 01:06:48,574 So let me go ahead and drag this. 1177 01:06:48,574 --> 01:06:51,740 Now notice I don't have to interlock\n 1178 01:06:51,740 --> 01:06:52,971 and I want to move some things around. 1179 01:06:52,971 --> 01:06:55,280 I just want to use this as\nlike a canvas temporarily. 1180 01:06:55,280 --> 01:06:58,141 Let me go ahead and\nsteal the Join from here 1181 01:06:58,141 --> 01:07:01,581 put it there, let me throw away\nthe Say block by just moving it 1182 01:07:01,581 --> 01:07:04,641 left and letting go, and\nnow let me join this in 1183 01:07:04,641 --> 01:07:08,581 so I've now changed my program\nto be a little more interesting. 1184 01:07:08,581 --> 01:07:10,341 So now let me stop the old version. 1185 01:07:18,365 --> 01:07:20,240 DAVID MALAN: (LAUGHING)\nOK, minus 2 for real. 1186 01:07:20,240 --> 01:07:27,260 All right, so what I accidentally\n 1187 01:07:27,260 --> 01:07:30,170 for instructional purposes,\nwas the actual answer 1188 01:07:30,170 --> 01:07:32,660 that came back from the ask block. 1189 01:07:33,621 --> 01:07:37,593 So now if I play this again,\nlet's click the green icon. 1190 01:07:50,271 --> 01:07:54,481 OK, so we have these functions then\n 1191 01:07:54,481 --> 01:07:57,441 Well what about those conditionals\n 1192 01:07:57,440 --> 01:07:59,810 How can we bring these programs\nto life so it's not just 1193 01:07:59,811 --> 01:08:01,951 clicking a button and voila,\nsomething's happening? 
1194 01:08:01,951 --> 01:08:04,251 Let's go ahead and make this\nnow even more interactive. 1195 01:08:04,251 --> 01:08:06,501 Let me go ahead and throw\naway most of these pieces 1196 01:08:06,501 --> 01:08:09,471 and let me just spice things up\n 1197 01:08:09,471 --> 01:08:12,170 I'm going to go to Play\nSound Meow until done. 1198 01:08:15,251 --> 01:08:18,521 OK, it's a little loud, but it\ndid exactly do what it said. 1199 01:08:21,791 --> 01:08:23,490 It's kind of an underwhelming\nprogram eventually 1200 01:08:23,490 --> 01:08:26,551 since you'd like to think that the\n 1201 01:08:26,940 --> 01:08:28,358 I have to keep hitting the button. 1202 01:08:28,358 --> 01:08:31,386 Well this seems like an opportunity\n 1203 01:08:31,386 --> 01:08:33,511 So all right, well if I\nwanted to meow, meow, meow 1204 01:08:33,511 --> 01:08:37,171 let me just grab a few of these, or you\n 1205 01:08:37,171 --> 01:08:39,360 and you can Copy Paste\neven in code here. 1206 01:08:43,230 --> 01:08:45,228 All right, so now like\nit's not really emoting 1207 01:08:45,228 --> 01:08:46,560 happiness in quite the same way. 1208 01:08:49,421 --> 01:08:52,481 Let me go to Control, wait\none second in between 1209 01:08:52,480 --> 01:08:55,860 which might be a little less worrisome. 1210 01:09:01,451 --> 01:09:06,310 OK, so if my goal was to make\nthe cat meow three times 1211 01:09:06,310 --> 01:09:10,000 I dare say this code or\nalgorithm is correct. 1212 01:09:10,001 --> 01:09:12,940 But let's now critique its design. 1213 01:09:21,501 --> 01:09:27,052 AUDIENCE: You could use the forever\n 1214 01:09:27,052 --> 01:09:28,510 DAVID MALAN: Yeah, so yeah, agreed. 1215 01:09:28,511 --> 01:09:31,301 I could use forever or repeat,\nbut let me push a little harder. 1216 01:09:31,990 --> 01:09:36,070 Like this works, I'm kind of done with\n 1217 01:09:36,070 --> 01:09:37,881 AUDIENCE: There's too much repetition. 
1218 01:09:37,881 --> 01:09:40,131 DAVID MALAN: Yeah, there's\ntoo much repetition, right? 1219 01:09:40,131 --> 01:09:42,591 If I wanted to change the\nsound that the cat is making 1220 01:09:42,591 --> 01:09:46,041 to a different variant of meow or\n 1221 01:09:46,041 --> 01:09:48,351 I could change it from the\ndropdown here apparently 1222 01:09:48,350 --> 01:09:51,350 but then I'd have to change it here\n 1223 01:09:51,350 --> 01:09:54,110 and God, if this were even longer\nthat just gets tedious quickly 1224 01:09:54,110 --> 01:09:56,030 and you're probably\nincreasing the probability 1225 01:09:56,030 --> 01:09:57,081 that you're going to\nscrew up and you're going 1226 01:09:57,081 --> 01:10:00,072 to miss one of the dropdowns or\n 1227 01:10:00,072 --> 01:10:02,779 Or, if you wanted to change the\n 1228 01:10:02,779 --> 01:10:05,150 you've got to change it in\ntwo, maybe even more places. 1229 01:10:05,150 --> 01:10:07,581 Again, you're just\ncreating risk for yourself 1230 01:10:07,581 --> 01:10:09,150 and potential bugs in the program. 1231 01:10:09,150 --> 01:10:13,041 So I do like the repeat or the forever\n 1232 01:10:13,041 --> 01:10:15,801 And indeed, what I\nalluded to being possible 1233 01:10:15,801 --> 01:10:18,651 copy pasting earlier, doesn't\nmean it's a good thing. 1234 01:10:18,650 --> 01:10:20,421 And in code, generally\nspeaking, when you 1235 01:10:20,421 --> 01:10:23,841 start to copy and paste puzzle\npieces or text next week 1236 01:10:23,841 --> 01:10:26,910 you're probably not doing\nsomething quite well. 1237 01:10:26,909 --> 01:10:30,680 So let me go ahead and throw away most\n 1238 01:10:30,680 --> 01:10:33,680 keeping just two of the\nblocks that I care about. 1239 01:10:33,680 --> 01:10:37,880 Let me grab the Repeat block for now,\n 1240 01:10:37,881 --> 01:10:40,941 block, it's going to grow to fit\nit, let me reconnect all this 1241 01:10:40,940 --> 01:10:44,449 and change the 10 just\nto a 3, and now, Play. 
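As a rough text-based analogue of the design critique above, here is a minimal Python sketch. The helper names `play_sound_meow` and `wait_one_second` are hypothetical stand-ins for the Scratch blocks, and a list is used to record what would happen instead of playing audio.

```python
# Hypothetical stand-ins for Scratch's "play sound Meow until done"
# and "wait 1 seconds" blocks; they only record what would happen.
def play_sound_meow(log):
    log.append("meow")


def wait_one_second(log):
    log.append("wait")


def meow_copy_paste():
    """Correct, but every repetition is a separate copy to maintain."""
    log = []
    play_sound_meow(log)
    wait_one_second(log)
    play_sound_meow(log)
    wait_one_second(log)
    play_sound_meow(log)
    wait_one_second(log)
    return log


def meow_with_repeat(times=3):
    """Same behavior, but the count lives in exactly one place."""
    log = []
    for _ in range(times):
        play_sound_meow(log)
        wait_one_second(log)
    return log
```

Both versions produce the same sequence for three meows, but only the second lets the count change to 40 by editing one value.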
1242 01:10:51,621 --> 01:10:53,538 It's still correct, but\nnow I've set the stage 1243 01:10:53,537 --> 01:10:57,040 to let the cat meow, for instance,\n 1244 01:10:57,041 --> 01:11:00,431 40 times by changing one thing, or\n 1245 01:11:00,430 --> 01:11:03,159 and just walk away and it\nwill meow forever instead. 1246 01:11:03,159 --> 01:11:05,230 If that's your goal,\nthat would be better. 1247 01:11:05,230 --> 01:11:07,569 A better design but still correct. 1248 01:11:08,319 --> 01:11:10,270 Now that I have a\nprogram that's designed 1249 01:11:10,270 --> 01:11:13,630 to have a cat meow, wow like why? 1250 01:11:13,631 --> 01:11:16,360 I mean, MIT invented\nScratch, Scratch is a cat 1251 01:11:16,359 --> 01:11:18,371 why is there no puzzle\npiece called Meow? 1252 01:11:18,371 --> 01:11:20,081 This feels like a missed opportunity. 1253 01:11:20,081 --> 01:11:22,751 Now to be fair, they gave\nus all the building blocks 1254 01:11:22,751 --> 01:11:26,032 with which we could implement that\n 1255 01:11:26,032 --> 01:11:28,239 and really computer science\nis to leverage what we're 1256 01:11:28,239 --> 01:11:30,489 going to now start calling Abstraction. 1257 01:11:30,489 --> 01:11:34,480 We have step-by-step instructions\nhere, the Repeat, the Play 1258 01:11:34,480 --> 01:11:37,001 and the Wait that collectively\nimplement this idea 1259 01:11:37,001 --> 01:11:38,771 that we humans would call meowing. 1260 01:11:38,770 --> 01:11:41,831 Wouldn't it be nice to abstract\naway those several puzzle 1261 01:11:41,831 --> 01:11:45,701 pieces into just one that literally\n 1262 01:11:45,701 --> 01:11:48,280 Well here's where we\ncan make our own blocks. 1263 01:11:48,280 --> 01:11:52,331 Let me go over here to Scratch\nunder the pink block category 1264 01:11:52,331 --> 01:11:55,932 here and let me click Make a Block.
1265 01:11:55,932 --> 01:11:57,640 Here I see a slightly\ndifferent interface 1266 01:11:57,640 --> 01:12:00,490 where I can choose a name for it\nand I'm going to call it Meow. 1267 01:12:05,180 --> 01:12:08,201 Now I'm just going to\nclean this up a bit here. 1268 01:12:08,201 --> 01:12:12,020 Let me drag and drop Play\nSound and Wait over here. 1269 01:12:13,030 --> 01:12:15,850 I'm just going to drag this\nway down here, way down 1270 01:12:15,850 --> 01:12:18,190 here because now that I'm\ndone implementing Meow 1271 01:12:18,190 --> 01:12:21,050 I'm going to literally abstract\nit away, sort of out of sight 1272 01:12:21,051 --> 01:12:25,301 out of mind, because now notice at\n 1273 01:12:26,711 --> 01:12:31,541 So at this point, I'd argue it doesn't\n 1274 01:12:31,541 --> 01:12:35,020 Frankly, I don't know how Ask\nor Say was implemented by MIT. 1275 01:12:35,020 --> 01:12:37,210 They abstracted those\nthings away for us. 1276 01:12:37,211 --> 01:12:40,121 Now I have a brand new puzzle\npiece that just says what it is. 1277 01:12:40,121 --> 01:12:44,440 And this is now still correct,\nbut arguably better design. 1278 01:12:45,041 --> 01:12:47,440 Because it's just more\nreadable to me, to you 1279 01:12:47,440 --> 01:12:49,690 it's more maintainable\nwhen you look at your code 1280 01:12:49,690 --> 01:12:52,240 a year from now for the first time\n 1281 01:12:52,240 --> 01:12:53,948 back at the very first\nprogram you wrote. 
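The custom Meow block described above might be sketched in Python as a function definition. This is only an analogy under stated assumptions: the strings stand in for the Scratch blocks, and `when_green_flag_clicked` is a hypothetical name for the main script.

```python
def meow(log):
    """Custom block: hides the play-sound and wait steps behind one name."""
    log.append("play sound Meow until done")
    log.append("wait 1 seconds")


def when_green_flag_clicked():
    """Main script: reads as 'repeat 3: meow' with no implementation details."""
    log = []
    for _ in range(3):
        meow(log)
    return log
```

The caller no longer needs to know how meowing is implemented, which is the abstraction the lecture is describing.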
1282 01:12:55,541 --> 01:12:59,230 The function itself has semantics,\n 1283 01:12:59,230 --> 01:13:01,960 If you really care about\nhow Meow is implemented 1284 01:13:01,961 --> 01:13:04,931 you could scroll down and start\nto tinker with the underlying 1285 01:13:04,930 --> 01:13:09,701 implementation details, but otherwise\n 1286 01:13:09,701 --> 01:13:13,211 Now I feel like there's an\neven additional opportunity 1287 01:13:13,211 --> 01:13:17,711 here for abstraction and to factor\n 1288 01:13:17,711 --> 01:13:21,161 It's kind of lame that I\nhave this Repeat block that 1289 01:13:21,161 --> 01:13:24,511 lets me call the Meow function,\n 1290 01:13:25,570 --> 01:13:28,480 Wouldn't it be nice if I could\njust call the Meow function 1291 01:13:28,480 --> 01:13:32,860 aka use the Meow function, and pass\n 1292 01:13:32,860 --> 01:13:35,110 piece how many times I want it to meow? 1293 01:13:35,110 --> 01:13:37,970 Well let me go ahead and\nzoom out and scroll down. 1294 01:13:37,970 --> 01:13:41,560 Let me right click or Control click on\n 1295 01:13:41,560 --> 01:13:44,951 or I could just start from scratch,\n 1296 01:13:44,951 --> 01:13:48,791 Now here, rather than just give this\n 1297 01:13:50,801 --> 01:13:53,081 I'm going to go ahead and\ntype in, for instance, n 1298 01:13:53,081 --> 01:13:56,141 for number of times to\nmeow, and just to make 1299 01:13:56,140 --> 01:13:58,505 this even more user friendly\nand self descriptive 1300 01:13:58,506 --> 01:14:00,881 I'm going to add a label,\nwhich has no functional impact 1301 01:14:00,881 --> 01:14:03,070 it's just an aesthetic,\nand I'm just going 1302 01:14:03,070 --> 01:14:05,740 to say Times, just to make\nit read more like English 1303 01:14:05,740 --> 01:14:08,201 in this case that tells me\nwhat the puzzle piece does. 1304 01:14:09,680 --> 01:14:12,201 And now I need to refine\nthis a little bit.
1305 01:14:12,201 --> 01:14:18,480 Let me go ahead and grab\nunder Control a repeat block 1306 01:14:18,480 --> 01:14:22,180 let me move the Play, Sound,\nand Wait, into the repeat block. 1307 01:14:22,180 --> 01:14:24,720 I don't want 10 and I\nalso don't want 3 here. 1308 01:14:24,720 --> 01:14:29,880 What I want now is this n that is\n 1309 01:14:29,881 --> 01:14:33,271 is creating for me that represents\n 1310 01:14:34,025 --> 01:14:35,400 Notice that snaps right in place. 1311 01:14:35,400 --> 01:14:39,451 Let me connect this and now voila, I\n 1312 01:14:40,951 --> 01:14:44,878 It takes input that affects\nits behavior accordingly. 1313 01:14:44,878 --> 01:14:47,671 Now I'm going to scroll back up,\n 1314 01:14:47,671 --> 01:14:49,021 I just care that Meow exists. 1315 01:14:49,020 --> 01:14:52,560 Now I can tighten up my code, so\nto speak, use even fewer lines 1316 01:14:52,560 --> 01:14:55,740 to do the same thing by\nthrowing away the Repeat block 1317 01:14:55,740 --> 01:15:00,091 reconnecting this new puzzle piece here\n 1318 01:15:00,091 --> 01:15:02,221 now we're really programming, right? 1319 01:15:02,220 --> 01:15:04,560 We've not made any forward\nprogress functionally. 1320 01:15:04,560 --> 01:15:06,871 The thing just meows three times. 1321 01:15:09,060 --> 01:15:11,371 As you program more and\nmore, these are the kinds 1322 01:15:11,371 --> 01:15:13,860 of instincts you'll start\nto acquire so that one 1323 01:15:13,860 --> 01:15:16,711 you can start to take a big assignment,\n 1324 01:15:16,711 --> 01:15:20,371 for homework even, that feels kind of\n 1325 01:15:21,541 --> 01:15:25,440 But if you start to identify what are\n 1326 01:15:25,440 --> 01:15:27,390 Then you can start making progress.
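The refinement described above, where the block takes an input n, might look like this as a Python sketch. The parameter name `n` comes from the lecture; the `log` list and the function name `when_green_flag_clicked` are assumptions made so the example can run without audio.

```python
def meow(n, log):
    """Custom block with an input: meow n times."""
    for _ in range(n):
        log.append("play sound Meow until done")
        log.append("wait 1 seconds")


def when_green_flag_clicked():
    """Main script: the repeat block is gone, the count is passed in."""
    log = []
    meow(3, log)
    return log
```

The loop has moved inside the function, so the calling script shrinks to a single call while the behavior stays the same.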
1327 01:15:27,390 --> 01:15:32,070 I do this to this day where if I have to\n 1328 01:15:32,070 --> 01:15:36,121 it's so easy to drag my feet and ugh,\n 1329 01:15:36,121 --> 01:15:38,820 until I just start writing\ndown like a to do list 1330 01:15:38,820 --> 01:15:41,490 and I start to modularize the\nprogram and say, all right, well 1331 01:15:41,490 --> 01:15:42,600 what do I want this thing to do? 1332 01:15:43,680 --> 01:15:45,680 I've got to have it say\nsomething on the screen. 1333 01:15:45,680 --> 01:15:48,150 All right, I need to have it\nsay something on the screen 1334 01:15:49,081 --> 01:15:52,421 Like literally a mental or written\n 1335 01:15:52,421 --> 01:15:55,621 if you will, in English on a\npiece of paper or text file 1336 01:15:55,621 --> 01:15:57,810 and then you can decide,\nOK, the first thing I 1337 01:15:57,810 --> 01:16:00,790 need to do for homework to\nsolve this real world problem 1338 01:16:02,161 --> 01:16:04,261 I need to use a bunch\nof other code, too 1339 01:16:04,261 --> 01:16:06,661 but I need to create a\nMeow function and boom 1340 01:16:06,661 --> 01:16:10,440 now you have a piece of the problem\n 1341 01:16:10,440 --> 01:16:14,850 book there, but in this case, we'll\n 1342 01:16:14,850 --> 01:16:16,510 All right, so what more can we do? 1343 01:16:16,511 --> 01:16:18,993 Let's add a few more\npieces to the puzzle here. 1344 01:16:18,993 --> 01:16:20,701 Let's actually interact\nwith the cat now. 1345 01:16:20,701 --> 01:16:24,030 Let me go ahead and now when the\n 1346 01:16:24,030 --> 01:16:26,820 and ask a question using an event here. 1347 01:16:26,820 --> 01:16:31,081 Let me go ahead and\nsay, let's see, I want 1348 01:16:31,081 --> 01:16:34,331 to do something like implement\nthe notion of petting the cat. 1349 01:16:34,331 --> 01:16:39,841 So if the cursor is touching the\n 1350 01:16:39,841 --> 01:16:42,472 it'd be cute if the cat meows\nlike you're petting a cat. 
1351 01:16:42,472 --> 01:16:45,180 So I'm going to ask the question,\n 1352 01:16:45,180 --> 01:16:48,370 if let's see I think I need Sensing. 1353 01:16:48,371 --> 01:16:50,761 So if touching mouse\npointer, this is way too big 1354 01:16:50,761 --> 01:16:52,891 but again the shape is\nfine, so there goes. 1355 01:16:53,730 --> 01:16:56,280 And then if it's touching\nthe mouse pointer 1356 01:16:56,280 --> 01:16:59,760 that is if the cat to whom\nthis script or this program 1357 01:16:59,761 --> 01:17:03,031 any time I attach puzzle\npieces MIT calls them a script 1358 01:17:03,030 --> 01:17:07,230 or like a program, if you will, let\n 1359 01:17:07,230 --> 01:17:10,219 and say play sound meow until done. 1360 01:17:10,219 --> 01:17:11,761 All right, so here it is to be clear. 1361 01:17:11,761 --> 01:17:13,891 When the green flag is\nclicked, ask the question 1362 01:17:13,890 --> 01:17:18,180 if the cat is touching the mouse\npointer then play sound meow. 1363 01:17:29,581 --> 01:17:34,351 I'm worried it's not Scratch's\nfault. Feels like mine. 1364 01:17:39,381 --> 01:17:42,231 Yeah, in back, who just turned. 1365 01:17:47,935 --> 01:17:50,810 DAVID MALAN: Yeah, the problem is\n 1366 01:17:50,810 --> 01:17:54,260 Scratch asks the question, is the\n 1367 01:17:54,261 --> 01:17:57,261 And obviously it's not because the\n 1368 01:17:58,430 --> 01:18:01,370 It's fine if I move the cursor\ndown there, but too late. 1369 01:18:01,371 --> 01:18:03,141 The program already asked the question. 1370 01:18:03,140 --> 01:18:06,810 The answer was no or false or zero,\n 1371 01:18:08,190 --> 01:18:10,340 So what might the solution here be? 1372 01:18:10,341 --> 01:18:12,891 I could move my cursor\nquickly, but that feels 1373 01:18:12,890 --> 01:18:14,900 like never going to work out right. 1374 01:18:18,051 --> 01:18:20,521 Could you use the forever loop?
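The bug discussed above, and the forever-loop fix the audience suggests, can be illustrated with a small Python simulation. This is a sketch under assumptions: `touching_per_frame` is a hypothetical list of booleans saying whether the cat is touching the mouse pointer at each moment, and a finite list stands in for "forever".

```python
def check_once(touching_per_frame):
    """Ask the question a single time, right when the flag is clicked."""
    meows = 0
    if touching_per_frame and touching_per_frame[0]:
        meows += 1
    return meows


def check_forever(touching_per_frame):
    """Ask the question on every frame, like an If inside a Forever block."""
    meows = 0
    for touching in touching_per_frame:  # finite stand-in for "forever"
        if touching:
            meows += 1
    return meows


# The cursor starts away from the cat and reaches it a moment later.
frames = [False, False, True, True, False]
```

With these frames, the one-time check never meows because the answer was false at the only moment it asked, while the looping version responds as soon as the cursor arrives.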
1375 01:18:21,341 --> 01:18:24,601 So I could indeed use this Forever\n 1376 01:18:24,600 --> 01:18:28,080 to just constantly listen to me, well\n 1377 01:18:28,081 --> 01:18:30,150 or at least forever as\nlong as the program is 1378 01:18:30,150 --> 01:18:32,318 running until I explicitly hit Stop. 1379 01:18:33,150 --> 01:18:36,480 Let me go to Control, let\nme grab the Forever block 1380 01:18:36,480 --> 01:18:40,260 let me move the If inside of this\nForever block, reconnect this 1381 01:18:40,261 --> 01:18:43,951 go back up here, click the green\n 1382 01:18:43,951 --> 01:18:45,721 but let me try moving my cursor now. 1383 01:18:50,381 --> 01:18:52,298 So now the cat is actually\nresponding and it's 1384 01:18:52,297 --> 01:18:54,840 going to keep doing\nthis again and again. 1385 01:18:54,841 --> 01:18:58,651 So now we have this idea of taking these\n 1386 01:18:58,650 --> 01:19:01,501 pieces, assembling them into\nsomething more complicated. 1387 01:19:01,501 --> 01:19:03,621 I could definitely put a name to this. 1388 01:19:03,621 --> 01:19:05,371 I could create a custom\nblock, but for now 1389 01:19:05,371 --> 01:19:08,074 let's just consider what kind\nof more interactivity we can do. 1390 01:19:08,073 --> 01:19:09,240 Let me go ahead and do this. 1391 01:19:09,240 --> 01:19:12,930 By again grabbing a,\nwhen green flag clicked 1392 01:19:12,930 --> 01:19:16,003 let me go ahead and\nclick the video sensing 1393 01:19:16,003 --> 01:19:18,421 and I'm going to rotate the\nlaptop because otherwise we're 1394 01:19:18,421 --> 01:19:21,463 going to get a little inception thing\n 1395 01:19:22,470 --> 01:19:25,980 So I'm going to go reveal to\nyou what's inside the lectern 1396 01:19:29,911 --> 01:19:34,980 Now that we have a non video\nbackdrop, I'm going to say this. 
1397 01:19:34,980 --> 01:19:37,110 Instead of the green flag\nclicked, actually, I'm 1398 01:19:37,110 --> 01:19:40,770 going to say when the video motion\n 1399 01:19:40,770 --> 01:19:47,233 measurement of motion, I'm going to go\n 1400 01:19:47,233 --> 01:19:48,940 And then I'm going to\nget out of the way. 1401 01:19:50,511 --> 01:19:53,341 We'll put them on top of there. 1402 01:20:00,451 --> 01:20:03,171 So my hand is moving faster\nthan 50 something or other 1403 01:20:03,171 --> 01:20:05,021 whatever the unit of measure is. 1404 01:20:06,881 --> 01:20:08,298 DAVID MALAN: (LAUGHING) Thank you. 1405 01:20:08,297 --> 01:20:10,960 So now we have an even\nmore interactive version. 1406 01:20:12,081 --> 01:20:15,721 But I think if I sort of slowly. 1407 01:20:18,780 --> 01:20:23,570 It's completely creepy, but I'm\n 1408 01:20:24,360 --> 01:20:27,161 Until finally my hand\nmoves as fast as that. 1409 01:20:27,161 --> 01:20:29,341 And so here actually is\nan opportunity to show you 1410 01:20:29,341 --> 01:20:31,261 something a former student did. 1411 01:20:35,501 --> 01:20:38,391 Let me go ahead and zoom out\nof this in just a moment. 1412 01:20:40,610 --> 01:20:42,360 (LAUGHING) If someone\nwould be comfortable 1413 01:20:42,360 --> 01:20:44,871 coming up not only masked but\nalso on camera on the internet 1414 01:20:44,871 --> 01:20:48,751 I thought we'd play one of your former\n 1415 01:20:48,751 --> 01:20:51,400 Would anyone like to volunteer\nhere and be up on stage? 1416 01:20:57,011 --> 01:20:58,981 Let me get it set up for you here. 1417 01:21:10,180 --> 01:21:13,240 All right, let me go ahead\nand full screen this here. 1418 01:21:13,240 --> 01:21:17,640 So this is whack-a-mole by one\nof your former predecessors. 1419 01:21:17,640 --> 01:21:20,670 It's going to use the camera focusing\n 1420 01:21:20,671 --> 01:21:22,328 to position inside of this rectangle. 1421 01:21:22,328 --> 01:21:24,661 Have you ever played the\nwhack-a-mole game at an arcade?
1422 01:21:25,470 --> 01:21:27,617 So for those who haven't,\nthese little moles pop up 1423 01:21:27,618 --> 01:21:29,701 and with a very fuzzy\nhammer you sort of hit down. 1424 01:21:29,701 --> 01:21:31,493 You though, if you\ndon't mind, you're going 1425 01:21:31,493 --> 01:21:34,121 to use your head to do this virtually. 1426 01:21:34,121 --> 01:21:39,121 So let's line up your head with\n 1427 01:21:48,211 --> 01:21:50,400 And now hit the moles with your head. 1428 01:22:14,930 --> 01:22:16,850 All right, a round of\napplause for Sahar. 1429 01:22:24,600 --> 01:22:26,732 So beyond having a\nlittle bit of fun here 1430 01:22:26,733 --> 01:22:28,440 the goal was to\ndemonstrate that by using 1431 01:22:28,440 --> 01:22:31,890 some fairly simple, primitive,\nsome basic building blocks 1432 01:22:31,890 --> 01:22:34,260 but assembling them in a fun\nway with some music, maybe 1433 01:22:34,261 --> 01:22:37,781 some new costumes or artwork, you\n 1434 01:22:37,780 --> 01:22:40,841 But at the end of the day, the\n 1435 01:22:40,841 --> 01:22:43,591 were ones like the ones I just\n 1436 01:22:43,591 --> 01:22:45,311 because there were\nclearly lots of moles. 1437 01:22:45,310 --> 01:22:49,290 So the student probably created a few\n 1438 01:22:49,291 --> 01:22:50,708 but at least four different moles. 1439 01:22:50,707 --> 01:22:53,457 They had like some kind of graphic\n 1440 01:22:54,600 --> 01:22:57,090 There were some kind of\ntimer, maybe a variable 1441 01:22:57,091 --> 01:22:59,676 that every second was counting down. 1442 01:22:59,676 --> 01:23:02,551 So you can imagine taking what looks\n 1443 01:23:02,551 --> 01:23:04,561 at first glance, and\nperhaps overwhelming 1444 01:23:04,560 --> 01:23:07,980 to solve yourself, but just think about\n 1445 01:23:07,980 --> 01:23:12,360 And pluck off one piece of the\npuzzle, so to speak, at a time. 1446 01:23:12,360 --> 01:23:15,100 So indeed if we rewind a little bit. 
1447 01:23:15,100 --> 01:23:17,880 Let me go ahead here\nand introduce a program 1448 01:23:17,881 --> 01:23:20,521 that I myself made\nback in graduate school 1449 01:23:20,520 --> 01:23:23,081 when Scratch was first\nbeing developed by MIT. 1450 01:23:23,081 --> 01:23:26,400 Let me go ahead and open\nhere, give me just one second 1451 01:23:26,400 --> 01:23:30,630 something that I called back\nin the day Oscar Time that 1452 01:23:30,631 --> 01:23:32,761 looks a little something like this. 1453 01:23:32,761 --> 01:23:34,591 If I fullscreen it and hit Play. 1454 01:23:34,591 --> 01:23:38,166 [MUSIC - SESAME STREET, "I LOVE TRASH"] 1455 01:23:38,166 --> 01:23:40,041 OSCAR THE GROUCH:\n(SINGING) Oh, I love trash. 1456 01:23:40,041 --> 01:23:42,458 DAVID MALAN: So you'll notice\na piece of trash is falling. 1457 01:23:42,457 --> 01:23:45,930 I can click on it and drag and as I get\n 1458 01:23:45,930 --> 01:23:47,930 OSCAR THE GROUCH: (SINGING)\nAnything ragged or-- 1459 01:23:47,930 --> 01:23:49,820 DAVID MALAN: It wants\nto go in, it seems. 1460 01:23:50,900 --> 01:23:52,400 OSCAR THE GROUCH: (SINGING) Yes, I-- 1461 01:23:54,935 --> 01:23:56,600 OSCAR THE GROUCH: (SINGING) If you\n 1462 01:23:56,600 --> 01:23:57,740 DAVID MALAN: I'll do\nthe same, two points. 1463 01:23:57,740 --> 01:24:00,140 OSCAR THE GROUCH: (SINGING) I have here\n 1464 01:24:00,140 --> 01:24:01,820 DAVID MALAN: There's a\nsneaker falling from the sky 1465 01:24:01,820 --> 01:24:03,288 so another sprite of some sort. 1466 01:24:03,288 --> 01:24:05,246 OSCAR THE GROUCH: (SINGING)\nThe laces are torn. 1467 01:24:07,070 --> 01:24:09,291 DAVID MALAN: I can also\nget just a little lazy 1468 01:24:09,291 --> 01:24:13,234 and just let them fall into the\ntrash themself if I want to. 1469 01:24:13,234 --> 01:24:15,651 So you can see it doesn't have\nto do with my mouse cursor 1470 01:24:15,650 --> 01:24:18,270 it has to do apparently\nwith the distance here. 
1471 01:24:18,270 --> 01:24:19,730 Let's listen a little further. 1472 01:24:19,730 --> 01:24:23,300 I think some additional trash\nis about to make its appearance. 1473 01:24:23,301 --> 01:24:26,871 Presumably there's some kind of variable\n 1474 01:24:26,871 --> 01:24:28,371 OSCAR THE GROUCH: (SINGING) I love-- 1475 01:24:28,371 --> 01:24:30,703 DAVID MALAN: OK, let's see\nwhat the last chorus here is. 1476 01:24:30,703 --> 01:24:32,720 OSCAR THE GROUCH:\n(SINGING) Rotten stuff. 1477 01:24:32,720 --> 01:24:35,810 I have here some newspaper, crusty and 1478 01:24:35,810 --> 01:24:37,790 DAVID MALAN: OK, and thus he continues. 1479 01:24:37,791 --> 01:24:40,701 And the song actually\ngoes on and on and on 1480 01:24:40,701 --> 01:24:43,371 and I do not have fond memories\nof implementing this and hearing 1481 01:24:43,371 --> 01:24:46,400 this song for like 10\nstraight hours, but it's 1482 01:24:46,400 --> 01:24:50,055 a good example to just consider\nhow was this program composed? 1483 01:24:50,055 --> 01:24:52,430 How did I go about implementing\nit the first time around? 1484 01:24:52,430 --> 01:24:54,755 And let me go ahead and\nopen up some programs now 1485 01:24:54,756 --> 01:24:56,631 that I wrote in advance\njust so that we could 1486 01:24:56,631 --> 01:24:58,911 see how these things are assembled. 1487 01:24:58,911 --> 01:25:02,331 Honestly, the first thing\nI probably did was probably 1488 01:25:02,331 --> 01:25:04,730 to do something a little like this. 1489 01:25:04,730 --> 01:25:07,161 Here is just a version\nof the program where 1490 01:25:07,161 --> 01:25:09,860 I set out to solve\njust one problem first 1491 01:25:09,860 --> 01:25:12,121 of planting a lamp post in the program. 1492 01:25:12,621 --> 01:25:14,391 I kind of had a vision of what I wanted. 
1493 01:25:14,390 --> 01:25:15,980 You know, it evolved\nover time, certainly 1494 01:25:15,980 --> 01:25:17,420 but I knew I wanted\ntrash to fall, I wanted 1495 01:25:17,421 --> 01:25:19,128 a cute little Oscar\nthe Grouch to pop out 1496 01:25:19,128 --> 01:25:22,041 of the trashcan, and some other\nstuff, but wow that's a lot 1497 01:25:23,690 --> 01:25:26,570 I'm going to start easy, download\na picture of a lamp post 1498 01:25:26,570 --> 01:25:30,951 and then drag and drop it into the\n 1499 01:25:31,970 --> 01:25:33,740 It doesn't functionally do anything. 1500 01:25:33,740 --> 01:25:36,831 I mean, literally that's the\ncode that I wrote to do this. 1501 01:25:36,831 --> 01:25:38,990 All I did was use like\nthe Backdrops feature 1502 01:25:38,990 --> 01:25:41,041 and drag and drop and\nmove things around 1503 01:25:41,041 --> 01:25:44,181 but it got me to version\none of my program. 1504 01:25:44,180 --> 01:25:46,010 Then what might version two be? 1505 01:25:46,011 --> 01:25:48,591 Well I considered what\npiece of functionality 1506 01:25:48,591 --> 01:25:52,083 frankly might be the easiest to\n 1507 01:25:52,082 --> 01:25:54,290 That seems like a pretty\ncore piece of functionality. 1508 01:25:54,291 --> 01:25:56,551 It just needs to sit\nthere most of the time. 1509 01:25:56,551 --> 01:25:59,931 So the next thing I\nprobably did was to open up 1510 01:25:59,930 --> 01:26:05,600 for instance, the trash can version\n 1511 01:26:06,113 --> 01:26:08,030 So this time I'll show\nyou what's inside here. 1512 01:26:08,030 --> 01:26:10,400 There is some code, but not much. 1513 01:26:10,400 --> 01:26:14,421 Notice at bottom right I change the\n 1514 01:26:14,421 --> 01:26:17,451 instead, but it's the same\nprinciple that I can control. 1515 01:26:17,451 --> 01:26:20,121 And then over here I added this code. 
1516 01:26:20,121 --> 01:26:23,030 When the green flag is\nclicked, switch the costume 1517 01:26:23,030 --> 01:26:25,110 to something I arbitrarily\ncalled Oscar 1. 1518 01:26:25,110 --> 01:26:26,871 So I found a couple\nof different pictures 1519 01:26:26,871 --> 01:26:29,990 of a trash can, one that looks\n 1520 01:26:29,990 --> 01:26:32,150 and eventually one that\nhas Oscar coming out 1521 01:26:32,150 --> 01:26:33,810 and I just gave them different names. 1522 01:26:33,810 --> 01:26:36,980 So I said Switch to Oscar 1, which\nis the closed one by default 1523 01:26:36,980 --> 01:26:40,791 then forever do the following:\nif touching the mouse pointer 1524 01:26:40,791 --> 01:26:45,711 then switch the costume to\nOscar 2, else switch to Oscar 1. 1525 01:26:45,711 --> 01:26:49,311 That is to say, I just wanted to\n 1526 01:26:49,310 --> 01:26:52,130 and closing, even if it's not\nexactly what I wanted ultimately 1527 01:26:52,131 --> 01:26:54,181 I just wanted to make\nsome forward progress. 1528 01:26:54,180 --> 01:26:59,940 So here, when I run this program by\n 1529 01:26:59,940 --> 01:27:03,440 Nothing yet, but if I get\ncloser to the trash can 1530 01:27:03,440 --> 01:27:07,310 it indeed pops open because\nit's forever listening 1531 01:27:07,310 --> 01:27:10,500 for whether the sprite,\nthe trash can in this case 1532 01:27:10,501 --> 01:27:11,751 is touching the mouse pointer. 1533 01:27:12,440 --> 01:27:15,501 That was version 2, if you will. 1534 01:27:15,501 --> 01:27:18,650 If I went in now and added the lamp\n 1535 01:27:18,650 --> 01:27:20,150 now we're starting to make progress. 1536 01:27:20,360 --> 01:27:22,940 Now it would look a little\nsomething more like the program 1537 01:27:22,940 --> 01:27:25,130 I intended ultimately to create. 1538 01:27:25,131 --> 01:27:27,961 What piece did I probably\nbite off after that? 
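The trash can script described above (switch to Oscar 1, then forever: if touching the mouse pointer switch to Oscar 2, else switch to Oscar 1) might be sketched in Python like this. The costume names come from the lecture; `touching_per_frame` is a hypothetical list of booleans that stands in for Scratch's sensing block, and the returned list records the costume after each step.

```python
def costume_for(touching_mouse_pointer):
    """If touching the mouse pointer, show the open can; else the closed one."""
    return "Oscar 2" if touching_mouse_pointer else "Oscar 1"


def run(touching_per_frame):
    """Record the costume at the start and after each pass of the loop."""
    costumes = ["Oscar 1"]  # switch costume to Oscar 1 when the flag is clicked
    for touching in touching_per_frame:  # finite stand-in for "forever"
        costumes.append(costume_for(touching))
    return costumes
```

Because the else branch resets the costume, the can closes again as soon as the pointer moves away, which is why the if/else sits inside the loop rather than before it.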
1539 01:27:27,961 --> 01:27:30,231 Well, I think what I did\nis I probably decided 1540 01:27:30,230 --> 01:27:33,740 let me implement one of the pieces of\n 1541 01:27:34,350 --> 01:27:37,880 Let's just get one piece of\ntrash working correctly first. 1542 01:27:37,881 --> 01:27:40,280 So let me go ahead and open this one. 1543 01:27:40,280 --> 01:27:43,850 And again, all of these examples will\n 1544 01:27:43,850 --> 01:27:46,040 so you can see all of\nthese examples, too. 1545 01:27:46,041 --> 01:27:48,440 It's not terribly long, I\njust implement it in advance 1546 01:27:48,440 --> 01:27:50,810 so we could flip\nthrough kind of quickly. 1547 01:27:52,341 --> 01:27:56,151 On the right hand side, I turned\nmy sprite into a piece of trash 1548 01:27:56,150 --> 01:27:58,770 this time instead of a cat,\ninstead of a trash can 1549 01:27:58,770 --> 01:28:04,130 and I also created, with Carter's help,\n 1550 01:28:04,131 --> 01:28:07,191 It's literally just a black line\nbecause I just wanted initially 1551 01:28:07,190 --> 01:28:09,621 to have some notion of a\nfloor so I could detect 1552 01:28:09,621 --> 01:28:12,141 if the trash is touching the floor. 1553 01:28:12,140 --> 01:28:15,411 Now without seeing the code yet,\njust hearing that description 1554 01:28:15,411 --> 01:28:20,301 why might I have wanted the second\n 1555 01:28:20,301 --> 01:28:22,748 with the trash intending\nto fall from the sky? 1556 01:28:22,747 --> 01:28:24,080 What might I have been thinking? 1557 01:28:24,081 --> 01:28:26,041 Like what problem might\nI be trying to solve? 1558 01:28:26,270 --> 01:28:28,730 AUDIENCE: You don't want the\nfirst sprite to go through it. 1559 01:28:28,730 --> 01:28:31,688 DAVID MALAN: Yeah, you don't want\n 1560 01:28:31,689 --> 01:28:34,280 go through, and then boom,\nyou completely lose it. 1561 01:28:34,280 --> 01:28:36,932 That would not be a very useful thing. 
1562 01:28:36,932 --> 01:28:39,890 Or it would seem to maybe eat up more\n 1563 01:28:39,890 --> 01:28:42,560 if the trash is just endlessly\nfalling and I can't grab it. 1564 01:28:42,560 --> 01:28:44,331 It might be a little traumatic\nif you tried to get it 1565 01:28:44,331 --> 01:28:46,873 and you can't pull it back out\nand you can't fix the program. 1566 01:28:46,872 --> 01:28:48,548 So I just wanted the thing to stop. 1567 01:28:48,548 --> 01:28:50,091 So how might I have implemented this? 1568 01:28:50,091 --> 01:28:51,740 Let's look at the code at left. 1569 01:28:51,740 --> 01:28:56,360 Here I have a bit of randomness,\nlike I proposed earlier exists. 1570 01:28:56,360 --> 01:28:59,240 There's this blue\nfunction called Go To x 1571 01:28:59,240 --> 01:29:03,051 y that lets me move a\nsprite to any position 1572 01:29:03,051 --> 01:29:07,731 up, down, left, right, I picked a random\n 1573 01:29:07,730 --> 01:29:12,470 negative 240 to positive 240, and then\n 1574 01:29:12,470 --> 01:29:14,220 This just makes the\ngame more interesting. 1575 01:29:14,220 --> 01:29:18,410 It's kind of lame pretty quickly if the\n 1576 01:29:18,411 --> 01:29:21,650 Here's this a little bit of randomness,\n 1577 01:29:23,121 --> 01:29:26,694 So now if I click the green flag,\nyou'll see that it just falls 1578 01:29:26,694 --> 01:29:28,611 nothing interesting is\ngoing to happen, but it 1579 01:29:28,610 --> 01:29:33,050 does stop when it touches the black\n 1580 01:29:33,051 --> 01:29:37,821 I'm forever asking the question if\n 1581 01:29:37,820 --> 01:29:41,550 is to the floor is greater\nthan zero, that's fine. 1582 01:29:41,551 --> 01:29:44,361 Change the y location by negative 3. 1583 01:29:44,360 --> 01:29:48,860 So move it down 3 pixels, down 3\n 1584 01:29:48,860 --> 01:29:52,701 is not greater than zero, it is zero\n 1585 01:29:52,701 --> 01:29:54,427 it should just stop moving altogether. 
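The falling-trash logic described above can be approximated in Python as follows. The x range of negative 240 to positive 240 and the step of 3 pixels come from the lecture; the starting height `START_Y` and the floor height `FLOOR_Y` are assumptions, since the transcript cuts off those details, and `max_frames` is a hypothetical bound that replaces "forever".

```python
import random

FLOOR_Y = -180  # assumed height of the floor sprite
START_Y = 180   # assumed starting height; the transcript does not give it


def drop_trash(rng, max_frames=1000):
    """Pick a random x, then fall 3 pixels per frame until reaching the floor."""
    x = rng.randint(-240, 240)
    y = START_Y
    for _ in range(max_frames):  # finite stand-in for "forever"
        if y - FLOOR_Y > 0:      # distance to the floor is greater than zero
            y -= 3               # change y by negative 3
        # otherwise do nothing, so the trash stays put
    return x, y
```

Calling `drop_trash(random.Random())` returns the resting position. The loop keeps running after the trash lands, but the condition is false from then on, so nothing moves.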
1586 01:29:54,427 --> 01:29:56,510 There's other ways we could\nhave implemented this 1587 01:29:56,511 --> 01:29:58,731 but this felt like a nice,\nclean way that logically, just 1588 01:29:59,523 --> 01:30:03,711 OK, now I got some trash falling, I\n 1589 01:30:03,711 --> 01:30:08,211 I have a lamp post, now I'm a\ngood three steps into the program. 1590 01:30:09,411 --> 01:30:12,411 If we consider one or two\nfinal pieces, something 1591 01:30:12,411 --> 01:30:17,570 like the dragging of the trash, let me\n 1592 01:30:17,570 --> 01:30:21,171 Dragging the trash requires\na different type of question. 1593 01:30:24,121 --> 01:30:26,900 I only need one sprite, no\nfloor here because I just 1594 01:30:26,900 --> 01:30:29,720 want the human to move it up,\ndown, left, right and the human's 1595 01:30:29,720 --> 01:30:32,750 not going to physically be able\nto move it outside of the world. 1596 01:30:32,751 --> 01:30:36,480 If we zoom in on this code, the way\n 1597 01:30:36,480 --> 01:30:40,670 We're using that And conjunction\n 1598 01:30:40,671 --> 01:30:44,301 the green flag is clicked, we're\n 1599 01:30:44,301 --> 01:30:47,571 these questions, plural,\nif the mouse is down 1600 01:30:47,570 --> 01:30:53,211 and the trash is touching the\nmouse pointer, that's equivalent 1601 01:30:53,211 --> 01:30:55,521 logically to clicking on the trash. 1602 01:30:55,520 --> 01:30:58,831 Go ahead and move the\ntrash to the mouse pointer. 1603 01:30:58,831 --> 01:31:00,891 So again it takes this\nvery familiar idea 1604 01:31:00,890 --> 01:31:04,010 that you and I take for granted\n 1605 01:31:06,621 --> 01:31:10,820 Well Mac OS or Windows are\nprobably asking a question. 1606 01:31:10,820 --> 01:31:15,320 For every icon, is the mouse down\n 1607 01:31:15,320 --> 01:31:19,461 If so, go to the location\nof the mouse forever 1608 01:31:19,461 --> 01:31:21,561 while the mouse button is clicked down. 1609 01:31:21,560 --> 01:31:23,720 So how does this work in reality now? 
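The dragging rule described above (if the mouse is down and the trash is touching the mouse pointer, move the trash to the mouse pointer) reduces to one conditional. In this Python sketch, positions are hypothetical (x, y) tuples and the two booleans stand in for Scratch's sensing blocks.

```python
def drag_step(trash_pos, mouse_pos, mouse_down, touching_mouse):
    """One pass of the forever loop: follow the pointer only while clicked on."""
    if mouse_down and touching_mouse:
        return mouse_pos  # go to mouse pointer
    return trash_pos      # otherwise stay where it is
```

Both conditions are needed: with the mouse button alone the trash would jump to the pointer from anywhere, and with touching alone it would stick to the pointer without a click.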
1610 01:31:23,720 --> 01:31:25,970 Let me go ahead and click on the Play. 1611 01:31:25,970 --> 01:31:30,030 Nothing happens at first, but if\n 1612 01:31:33,121 --> 01:31:36,871 So I now need to kind of combine\n 1613 01:31:36,871 --> 01:31:39,604 but I bet I could just start\nto use just one single program. 1614 01:31:39,604 --> 01:31:42,021 Right now I'm using separate\nones to show different ideas 1615 01:31:42,020 --> 01:31:45,170 but now that's another\nbite out of the problem. 1616 01:31:45,171 --> 01:31:48,028 If we do one last one,\nsomething like the scorekeeping 1617 01:31:48,028 --> 01:31:51,110 is interesting, because recall that\n 1618 01:31:51,110 --> 01:31:54,661 into the can, Oscar popped out\nand told us the current score. 1619 01:31:54,661 --> 01:31:59,240 So let me go ahead and find\nthis one, Oscar variables 1620 01:31:59,240 --> 01:32:01,220 and let me zoom in on this one. 1621 01:32:01,220 --> 01:32:04,423 This one is longer because we\ncombined all of these elements. 1622 01:32:04,423 --> 01:32:07,341 So this is the kind of thing that\n 1623 01:32:07,341 --> 01:32:09,861 I have no idea how I would\nhave implemented this 1624 01:32:09,860 --> 01:32:12,140 from nothing, from scratch literally. 1625 01:32:12,140 --> 01:32:16,490 But again, if you take your\nvision and componentize it 1626 01:32:16,490 --> 01:32:18,591 into these smaller,\nbite-sized problems, you 1627 01:32:18,591 --> 01:32:21,021 could take these baby\nsteps, so to speak, and then 1628 01:32:21,020 --> 01:32:22,590 solve everything collectively. 1629 01:32:22,591 --> 01:32:25,761 So what's new here is this bottom one.
1630 01:32:25,761 --> 01:32:30,411 Forever do the following:\nif the trash is touching 1631 01:32:30,411 --> 01:32:33,860 Oscar, the other sprite that\nwe've now added to the program 1632 01:32:35,841 --> 01:32:37,851 This is an orange and\nindeed if we poke around 1633 01:32:37,850 --> 01:32:42,411 we'll see that orange is a variable,\n 1634 01:32:42,411 --> 01:32:46,400 changing it means to add 1 or\nif it's negative subtract 1. 1635 01:32:46,400 --> 01:32:51,470 Then go ahead and have the\ntrash go to pick random. 1636 01:32:53,091 --> 01:32:56,941 Well, let me show you what it's doing\n 1637 01:32:58,341 --> 01:33:01,490 All right, it's falling, I'm clicking\n 1638 01:33:02,774 --> 01:33:04,191 All right, let me do it once more. 1639 01:33:06,831 --> 01:33:13,371 Why do I have this function at the\n 1640 01:33:13,371 --> 01:33:16,041 Like what problem is this solving here? 1641 01:33:18,132 --> 01:33:21,569 AUDIENCE: Just the same\ntrack teleported to the top 1642 01:33:21,569 --> 01:33:23,385 after you put it in the trash can. 1643 01:33:24,511 --> 01:33:27,303 Even though the human perceives\n 1644 01:33:27,302 --> 01:33:29,270 from the sky, it's\nactually the same piece 1645 01:33:29,270 --> 01:33:32,150 of trash, just kind of being\nmagically moved back to the top 1646 01:33:33,650 --> 01:33:36,530 There, too, you have this\nidea of reusable code. 
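The scorekeeping script above (forever: if the trash is touching Oscar, change the score by 1, then send the same trash back to a random spot at the top) might look roughly like this in C. It is a sketch under assumptions: `Sprite`, `score_step`, and the two constants are hypothetical names, and the bounds use Scratch's 480 by 360 stage, where x runs from -240 to 240 and the top edge is y = 180.

```c
#include <stdbool.h>
#include <stdlib.h>

// Assumed stage bounds (Scratch's stage is 480 wide and 360 tall).
#define STAGE_HALF_WIDTH 240
#define STAGE_TOP 180

typedef struct
{
    int x, y;
} Sprite;

// One pass of the "forever" loop. When the trash touches Oscar, add 1 to
// the score and reuse the same sprite by moving it to a random x at the
// top of the stage. Returns the (possibly updated) score.
int score_step(Sprite *trash, bool touching_oscar, int score)
{
    if (touching_oscar)
    {
        score += 1;
        trash->x = rand() % (2 * STAGE_HALF_WIDTH + 1) - STAGE_HALF_WIDTH;
        trash->y = STAGE_TOP;
    }
    return score;
}
```

Reusing one `Sprite` this way is the point made in the lecture: the player sees endless falling trash, but the program only ever moves a single piece.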
1647 01:33:36,530 --> 01:33:40,161 If you were constantly copying\nand pasting your pieces of trash 1648 01:33:40,161 --> 01:33:43,190 and creating 20 pieces of trash, 30\n 1649 01:33:43,190 --> 01:33:46,820 want the game to have that many\n 1650 01:33:46,820 --> 01:33:49,911 Reuse the code that you wrote,\nreuse the sprites that you wrote 1651 01:33:49,911 --> 01:33:54,681 and that would give you not just\n 1652 01:33:54,680 --> 01:33:57,665 Well let's take a look at one\nfinal set of building blocks 1653 01:33:57,666 --> 01:33:59,541 that we can compose\nultimately into something 1654 01:33:59,541 --> 01:34:02,011 particularly interactive as follows. 1655 01:34:02,011 --> 01:34:03,921 Let me go ahead and\nzoom out here and let 1656 01:34:03,921 --> 01:34:08,570 me propose that we implement something\n 1657 01:34:09,990 --> 01:34:13,310 So I want to implement\nsome maze-based game that 1658 01:34:13,310 --> 01:34:15,060 looks at first glance like this. 1659 01:34:15,735 --> 01:34:18,110 It's not a very fun game yet,\nbut here's a little Harvard 1660 01:34:18,110 --> 01:34:22,190 shield, a couple of black lines, this\n 1661 01:34:22,190 --> 01:34:24,230 but notice you can't\nquite see my hand here 1662 01:34:24,230 --> 01:34:28,948 but I'm using my arrow keys to go down,\n 1663 01:34:28,948 --> 01:34:31,490 but if I keep going right, right,\nright, right, right, right 1664 01:34:31,490 --> 01:34:32,841 right it's not going anywhere. 1665 01:34:32,841 --> 01:34:35,320 And left, left, left, left, left, left,\n 1666 01:34:36,740 --> 01:34:41,240 So before we look at the code,\nhow might this be working? 1667 01:34:41,240 --> 01:34:44,990 What kinds of scripts,\ncollections of puzzle pieces 1668 01:34:44,990 --> 01:34:47,490 might collectively\nhelp us implement this? 
1669 01:34:57,671 --> 01:35:00,838 There's probably some question being\n 1670 01:35:00,838 --> 01:35:03,131 and it happens to be a couple\nof sprites, each of which 1671 01:35:03,131 --> 01:35:06,381 is just literally a vertical black line\n 1672 01:35:07,451 --> 01:35:10,451 Is the distance to it\nzero or close to zero? 1673 01:35:10,451 --> 01:35:16,131 And if so, we just ignore the left\n 1674 01:35:16,871 --> 01:35:18,940 But otherwise, if we're\nnot touching a wall 1675 01:35:18,940 --> 01:35:22,480 what are we probably doing\ninstead forever here? 1676 01:35:22,480 --> 01:35:24,820 How is the movement working presumably? 1677 01:35:33,690 --> 01:35:35,564 DAVID MALAN: Sorry, say a little louder. 1678 01:35:35,564 --> 01:35:38,798 AUDIENCE: Presumably it's continually\n 1679 01:35:38,798 --> 01:35:40,190 and then moving when you do. 1680 01:35:41,065 --> 01:35:44,930 It's continually, forever listening for\n 1681 01:35:44,930 --> 01:35:47,270 and if the up arrow is\npressed, we're probably 1682 01:35:47,270 --> 01:35:49,800 changing the y by a positive value. 1683 01:35:49,801 --> 01:35:52,401 If the down arrow is pressed,\nwe're going down by y 1684 01:35:52,400 --> 01:35:54,090 and left and right accordingly. 1685 01:35:54,091 --> 01:35:55,671 So let's actually take a quick look. 1686 01:35:55,671 --> 01:35:59,001 If I zoom out here and take a look\n 1687 01:35:59,001 --> 01:36:01,556 there's a lot going on at\nfirst glance, but let's see. 1688 01:36:01,555 --> 01:36:03,680 First of all, let me drag\nsome stuff out of the way 1689 01:36:03,680 --> 01:36:05,840 because it's kind of\noverwhelming at first glance 1690 01:36:05,841 --> 01:36:09,291 especially if you, for instance, were\n 1691 01:36:09,291 --> 01:36:11,934 0 just to get inspiration,\nmost projects out there 1692 01:36:11,934 --> 01:36:13,851 are going to look\noverwhelming at first glance 1693 01:36:13,850 --> 01:36:16,320 until you start to wrap your\nmind around what's going on. 
1694 01:36:16,320 --> 01:36:19,581 But in this case, we've\nimplemented some abstractions 1695 01:36:19,581 --> 01:36:22,730 from the get go to explain to\n 1696 01:36:24,680 --> 01:36:29,100 This is that program with the two black\n 1697 01:36:30,081 --> 01:36:33,801 It initially puts the shield\nin the middle, 0,0, then 1698 01:36:33,801 --> 01:36:37,251 forever listens for keyboard,\nas I think you were describing 1699 01:36:37,251 --> 01:36:40,461 and it feels for the walls, as\nI think you were describing. 1700 01:36:42,860 --> 01:36:46,400 These are custom blocks we created\n 1701 01:36:46,400 --> 01:36:48,650 those implementation details\nbecause honestly that's 1702 01:36:48,650 --> 01:36:50,130 all I need to know right now. 1703 01:36:50,131 --> 01:36:52,851 But, as aspiring programmers,\nif we're curious now 1704 01:36:52,850 --> 01:36:55,460 let's scroll down to the\nactual implementation 1705 01:36:57,261 --> 01:36:59,701 This is the one on the left\nand it is a little long 1706 01:36:59,701 --> 01:37:02,461 but it's a lot of similar structure. 1707 01:37:02,461 --> 01:37:07,371 We're doing the following, if the up\n 1708 01:37:08,030 --> 01:37:11,600 If the down arrow is pressed,\nthen change y by negative 1. 1709 01:37:12,341 --> 01:37:15,361 Right arrow, left arrow, and that's it. 1710 01:37:15,360 --> 01:37:17,990 So it just assembles all\nof those ideas, combines it 1711 01:37:17,990 --> 01:37:20,451 into one new block just because\nit's kind of overwhelming 1712 01:37:20,451 --> 01:37:22,680 let's just implement it\nonce and tuck it away. 
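The "listen for keyboard" custom block just described is four similar questions tucked behind one name. A minimal C sketch of that abstraction follows. `Sprite`, `Keys`, and `listen_for_keyboard` are hypothetical names, and the up, right, and left amounts are assumed to mirror the "change y by negative 1" stated for the down arrow.

```c
#include <stdbool.h>

typedef struct
{
    int x, y;
} Sprite;

// Which arrow keys are currently held down (hypothetical input type).
typedef struct
{
    bool up, down, left, right;
} Keys;

// One pass of the custom block: each pressed arrow nudges the sprite
// by one pixel along the matching axis.
void listen_for_keyboard(Sprite *s, Keys k)
{
    if (k.up)
    {
        s->y += 1;
    }
    if (k.down)
    {
        s->y -= 1;
    }
    if (k.right)
    {
        s->x += 1;
    }
    if (k.left)
    {
        s->x -= 1;
    }
}
```

The caller only needs the function's name inside its own forever loop, which is the same implementation hiding the custom block provides in Scratch.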
1713 01:37:22,680 --> 01:37:26,720 And if we scroll now over to\nthe Feel for Walls function 1714 01:37:26,720 --> 01:37:30,080 this now is asking the\nquestion as hypothesized 1715 01:37:30,081 --> 01:37:35,001 if I'm touching the left wall, change my\n 1716 01:37:35,720 --> 01:37:38,660 If I'm touching the right\nwall, then move x by negative 1 1717 01:37:38,661 --> 01:37:40,592 to move a little bit away from it. 1718 01:37:40,592 --> 01:37:42,050 So it kind of bounces off the wall. 1719 01:37:42,051 --> 01:37:47,451 Just in case it slightly went over, we\n 1720 01:37:47,451 --> 01:37:51,230 All right, then a couple of\nmore pieces here to introduce. 1721 01:37:51,230 --> 01:37:54,440 What if we want to actually add\n 1722 01:37:55,310 --> 01:38:02,030 Well, let me go ahead to maybe this one\n 1723 01:38:02,030 --> 01:38:05,458 might, for instance, be designed to\n 1724 01:38:05,458 --> 01:38:08,751 This is like a maze and you're trying to\n 1725 01:38:09,810 --> 01:38:14,360 Uh oh, Yale is in the way and it\n 1726 01:38:15,541 --> 01:38:16,791 Well, let me ask someone else. 1727 01:38:19,251 --> 01:38:21,381 This is an idea you have,\nthis as an idea you see. 1728 01:38:21,381 --> 01:38:26,490 Let's reverse engineer in\nyour head how it works. 1729 01:38:28,935 --> 01:38:32,655 AUDIENCE: If the Yale symbol is\n 1730 01:38:34,639 --> 01:38:36,431 DAVID MALAN: Yeah, so\nif the Yale symbol is 1731 01:38:36,430 --> 01:38:39,130 touching the left wall or the right\n 1732 01:38:39,131 --> 01:38:41,756 And indeed we'll see there's a\npuzzle piece that can do exactly 1733 01:38:41,756 --> 01:38:44,066 that technically off\nthe edge, as we'll see 1734 01:38:44,065 --> 01:38:45,690 but there's another way we can do this. 1735 01:38:46,751 --> 01:38:49,271 The way we ourselves\ncan implement exactly 1736 01:38:49,270 --> 01:38:52,010 that idea bounce is just\nwith a little bit of logic. 
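The Feel for Walls block described a moment ago can be sketched the same way. This is an illustration only: the walls are modeled as two x-coordinates rather than as sprites, and `Sprite`, `feel_for_walls`, `left_wall_x`, and `right_wall_x` are hypothetical names.

```c
typedef struct
{
    int x, y;
} Sprite;

// One pass of the custom block: if the sprite has reached a wall,
// nudge it one pixel back toward the middle so it cannot pass through.
void feel_for_walls(Sprite *s, int left_wall_x, int right_wall_x)
{
    if (s->x <= left_wall_x)
    {
        s->x += 1; // touching the left wall: move right by 1
    }
    if (s->x >= right_wall_x)
    {
        s->x -= 1; // touching the right wall: move left by 1
    }
}
```

Run after every keyboard step, this cancels any move into a wall, which is why the shield appears to stop at the edge.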
1737 01:38:52,011 --> 01:38:54,581 So here's what this version\nof the program is doing. 1738 01:38:54,581 --> 01:38:58,721 It's moving Yale by default to 0,0\n 1739 01:38:58,720 --> 01:39:03,250 pointing it direction 90 degrees, which\n 1740 01:39:03,251 --> 01:39:06,490 and then it's forever doing\nthis: if touching the left wall 1741 01:39:06,490 --> 01:39:10,451 or touching the right wall,\nhere's our translation of bounce. 1742 01:39:10,451 --> 01:39:11,953 We're just turning 180 degrees. 1743 01:39:11,953 --> 01:39:13,661 And the nice thing\nabout that is we don't 1744 01:39:13,661 --> 01:39:16,368 have to worry if we're going from\n 1745 01:39:16,368 --> 01:39:19,480 180 degrees is going to\nwork on both of the walls. 1746 01:39:21,310 --> 01:39:24,670 After we do that, we just move\none step, one pixel, at a time 1747 01:39:24,671 --> 01:39:28,150 but we're doing it forever so\nsomething is happening continually 1748 01:39:28,150 --> 01:39:30,640 and the Yale icon is\nbouncing back and forth. 1749 01:39:30,640 --> 01:39:33,550 Well one final piece\nhere, what if now we 1750 01:39:33,551 --> 01:39:39,711 want another adversary, a more advanced\n 1751 01:39:39,711 --> 01:39:44,831 to go and follow us wherever\nwe are such that this time 1752 01:39:44,831 --> 01:39:51,161 we want the other sprite to\nnot just bounce back and forth 1753 01:39:51,161 --> 01:39:55,600 but literally follow us\nno matter where we go. 1754 01:39:55,600 --> 01:39:59,110 How might this be\nimplemented on the screen? 1755 01:39:59,110 --> 01:40:01,541 I bet it's another forever\nblock, but what's inside? 1756 01:40:01,541 --> 01:40:05,621 AUDIENCE: So forever get the\n 1757 01:40:05,621 --> 01:40:06,844 and move one step towards it. 
1758 01:40:06,844 --> 01:40:09,761 DAVID MALAN: Yeah, forever point at\n 1759 01:40:11,140 --> 01:40:15,340 This is just going to go on forever if I\n 1760 01:40:15,341 --> 01:40:18,551 Notice it's sort of twitching\nback and forth because it goes one 1761 01:40:18,551 --> 01:40:20,201 pixel then one pixel then one pixel. 1762 01:40:20,201 --> 01:40:21,911 It's sort of in a frantic state here. 1763 01:40:21,911 --> 01:40:25,421 We haven't finished the game yet, but if\n 1764 01:40:25,421 --> 01:40:28,121 It didn't take much to\nimplement this simple idea. 1765 01:40:28,121 --> 01:40:30,911 Go to a random position just\nto make it kind of fair 1766 01:40:30,911 --> 01:40:33,740 initially, then forever\npoint towards Harvard 1767 01:40:33,740 --> 01:40:37,060 which is what we called the Harvard\ncrest sprite, move one step. 1768 01:40:37,060 --> 01:40:39,701 Suppose we now wanted to\nmake a more advanced level. 1769 01:40:39,701 --> 01:40:42,610 What's a minor change I could\nlogically make to this code just 1770 01:40:42,610 --> 01:40:45,037 to make MIT even better at this? 1771 01:40:45,037 --> 01:40:46,870 AUDIENCE: Change the\nnumber of steps to two. 1772 01:40:46,871 --> 01:40:48,310 DAVID MALAN: All right, change\nthe number of steps to two. 1773 01:40:49,390 --> 01:40:51,370 So now they got twice as fast. 1774 01:40:51,371 --> 01:40:53,751 Let me go ahead and just\nget this out of the way. 1775 01:40:53,751 --> 01:40:56,841 Oops, let me make it a fair fight. 1776 01:40:58,310 --> 01:41:01,530 All right, I unfortunately am\nstill moving one pixel at a time 1777 01:41:01,530 --> 01:41:03,081 so this isn't going to end well. 
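Both adversaries from this part of the lecture fit in a few lines each. The sketch below is an approximation, not the original Scratch code: all names are hypothetical, the walls are plain x-coordinates, directions are kept as degrees from 0 to 359 (90 is right, 270 is left), and the chaser moves along each axis separately instead of along a true straight line toward its target.

```c
typedef struct
{
    int x, y;
    int direction; // degrees, 0..359; 90 means right, 270 means left
} Sprite;

// Bouncing adversary: if touching either wall, turn 180 degrees,
// then move one step in whatever direction it now faces.
void bounce_step(Sprite *s, int left_wall_x, int right_wall_x)
{
    if (s->x <= left_wall_x || s->x >= right_wall_x)
    {
        s->direction = (s->direction + 180) % 360;
    }
    s->x += (s->direction == 90) ? 1 : -1;
}

// Move `from` toward `to` by at most `steps`, without overshooting.
static int toward(int from, int to, int steps)
{
    if (to > from)
    {
        return (to - from < steps) ? to : from + steps;
    }
    if (to < from)
    {
        return (from - to < steps) ? to : from - steps;
    }
    return from;
}

// Following adversary: each pass closes the gap to the target by up to
// `steps` pixels per axis. Raising `steps` makes a harder level.
void chase_step(Sprite *chaser, Sprite target, int steps)
{
    chaser->x = toward(chaser->x, target.x, steps);
    chaser->y = toward(chaser->y, target.y, steps);
}
```

Changing `steps` from 1 to 2 is the one-number edit suggested from the audience: the chaser covers twice the distance on every pass while the player still moves one pixel at a time.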
1778 01:41:04,220 --> 01:41:10,230 And if we're really aggressive and\n 1779 01:41:11,930 --> 01:41:16,370 Jesus, OK, so that's how you might\n 1780 01:41:17,761 --> 01:41:20,694 So it's not an accident that we\nchose these particular examples 1781 01:41:20,694 --> 01:41:23,361 here involving these particular\nschools because we have one more 1782 01:41:23,360 --> 01:41:25,680 demonstration we thought\nwe'd introduce today 1783 01:41:25,680 --> 01:41:30,020 if we could get one other\nvolunteer to come up and play 1784 01:41:30,020 --> 01:41:34,400 what was called by one of your\npredecessors Ivy's Hardest Game. 1785 01:41:34,400 --> 01:41:35,661 Let's see, you in the middle. 1786 01:41:40,161 --> 01:41:41,993 DAVID MALAN: Come a\nlittle closer, actually. 1787 01:41:44,440 --> 01:41:47,067 All right, round of applause\nhere if we could, too. 1788 01:41:53,237 --> 01:41:54,487 OK, sorry, what was your name? 1789 01:41:59,020 --> 01:42:02,680 So here we have on this other\nscreen Ivy's Hardest Game 1790 01:42:02,680 --> 01:42:04,750 written by a former CS50 student. 1791 01:42:04,751 --> 01:42:07,451 I think you'll see that it\ncombines these same principles. 1792 01:42:07,451 --> 01:42:09,940 The maze is clearly a\nlittle more advanced. 1793 01:42:09,940 --> 01:42:14,320 The goal at hand is to initially move\n 1794 01:42:14,320 --> 01:42:17,108 the way on the right so that you\ncatch up to him in this case 1795 01:42:17,108 --> 01:42:18,940 but you'll see that\nthere's different levels 1796 01:42:18,940 --> 01:42:21,350 and different levels of sophistication. 1797 01:42:21,350 --> 01:42:24,820 So if you're up for it, you can use just\n 1798 01:42:24,820 --> 01:42:27,520 You'll be controlling the\nHarvard sprite and if we 1799 01:42:27,520 --> 01:42:32,291 could raise the volume just a little\n 1800 01:42:32,291 --> 01:42:34,421 Here we go, clicking the green flag. 
1801 01:42:41,444 --> 01:42:43,194 [MUSIC - MC HAMMER, "U CAN\'T TOUCH\nTHIS"] 1802 01:42:43,194 --> 01:42:45,100 MC HAMMER: (SINGING) Can't touch this. 1803 01:42:57,591 --> 01:42:58,801 MC HAMMER: (SINGING) so hard. 1804 01:42:58,801 --> 01:43:00,551 Makes me want to say, oh my Lord. 1805 01:43:03,520 --> 01:43:06,162 MC HAMMER: (SINGING) Feels\ngood when you know you're down. 1806 01:43:10,350 --> 01:43:12,520 MC HAMMER: (SINGING)\nYou can't touch this. 1807 01:43:17,859 --> 01:43:19,791 MC HAMMER: (SINGING) Can't touch this. 1808 01:43:23,618 --> 01:43:25,201 You let me bust the funky lyrics. 1809 01:43:27,314 --> 01:43:29,481 You got it like that and\nyou know you want to dance. 1810 01:43:29,480 --> 01:43:33,302 So move out of your seat and get\na fly girl and catch this beat. 1811 01:43:34,934 --> 01:43:38,283 Pump a little bit and let them know\n 1812 01:43:38,283 --> 01:43:39,740 Cold on a mission, so fall on back. 1813 01:43:39,740 --> 01:43:41,033 Let them know that you're too-- 1814 01:43:46,378 --> 01:43:48,798 MC HAMMER: (SINGING) Can't touch this. 1815 01:43:48,798 --> 01:43:50,251 Why you standing there, man? 1816 01:43:54,761 --> 01:43:58,316 Give me a song or rhythm, making\n 1817 01:44:00,220 --> 01:44:02,553 You talking the Hammer when\nyou're talking about a show. 1818 01:44:03,751 --> 01:44:06,601 Singers are sweating so them\na wipe or a tame to learn. 1819 01:44:06,600 --> 01:44:08,110 DAVID MALAN: Second to last level. 1820 01:44:08,610 --> 01:44:10,451 MC HAMMER: (SINGING) That chart's legit. 1821 01:44:10,451 --> 01:44:13,201 Either work hard or\nyou might as well quit. 1822 01:44:18,091 --> 01:44:20,391 MC HAMMER: (SINGING)\nYou can't touch this. 1823 01:44:20,390 --> 01:44:22,412 DAVID MALAN: You're almost there. 1824 01:44:22,412 --> 01:44:23,870 MC HAMMER: (SINGING) Break it down. 1825 01:44:36,161 --> 01:44:38,266 MC HAMMER: (SINGING) Stop, Hammer time. 1826 01:44:38,266 --> 01:44:39,878 Go with the flow," it is said. 
1827 01:44:39,878 --> 01:44:42,171 If you can't groove to this,\nthen you're probably dead. 1828 01:44:42,171 --> 01:44:44,240 So wave your hands in the\nair, bust a few moves 1829 01:44:44,240 --> 01:44:45,140 run your fingers through your hair. 1830 01:44:46,621 --> 01:44:49,051 Dance to this and you're\ngoing to get thinner. 1831 01:44:50,650 --> 01:44:53,305 Just for a minute let's all do the bump. 1832 01:45:03,400 --> 01:45:05,405 All right, that's it for CS50. 1833 01:46:29,360 --> 01:46:33,470 And this is week 1, the one in which\n 1834 01:46:33,470 --> 01:46:36,350 is something we technically said\n 1835 01:46:36,350 --> 01:46:39,650 played with this graphical language\n 1836 01:46:41,331 --> 01:46:43,461 But today, as promised,\nwe transition to something 1837 01:46:43,461 --> 01:46:45,921 a little more traditional,\na little more text-based 1838 01:46:45,921 --> 01:46:48,501 not puzzle piece- or\nblock-based, known as C. 1839 01:46:49,770 --> 01:46:50,978 It's been around for decades. 1840 01:46:50,979 --> 01:46:54,501 But it's a language that underlies so\n 1841 01:46:54,501 --> 01:46:57,261 among them something called\nPython that we'll also 1842 01:46:57,261 --> 01:46:58,821 come to in a few weeks' time. 1843 01:46:58,820 --> 01:47:00,921 Indeed, at the end of\nthe semester, the goal 1844 01:47:00,921 --> 01:47:02,661 is for you to feel that\nyou've not learned Scratch 1845 01:47:02,661 --> 01:47:04,911 you've not learned C, or\neven Python, for that matter 1846 01:47:04,911 --> 01:47:07,371 but fundamentally that you've\nlearned how to program. 1847 01:47:07,371 --> 01:47:09,801 Unfortunately, when you\nlearn how to program 1848 01:47:09,801 --> 01:47:13,681 with a more traditional language like\n 1849 01:47:13,680 --> 01:47:17,090 Last week I described all of the\n 1850 01:47:17,091 --> 01:47:19,881 that you see in this, like the\n 1851 01:47:19,881 --> 01:47:22,761 parentheses, curly braces,\nbackslash n, and more. 
1852 01:47:22,761 --> 01:47:26,791 Well, today we're not going to reveal\n 1853 01:47:27,291 --> 01:47:31,461 But by next week, will this no\n 1854 01:47:31,461 --> 01:47:35,391 to you, a language that, presumably,\n 1855 01:47:36,320 --> 01:47:40,201 But to do that, we'll explore some\n 1856 01:47:40,201 --> 01:47:43,221 So recall that, via Scratch-- and\npresumably via problem set 1-- 1857 01:47:43,220 --> 01:47:46,580 we took a look at things called\n 1858 01:47:46,581 --> 01:47:49,310 And related to functions\nwere arguments like inputs. 1859 01:47:49,310 --> 01:47:52,760 And related to some functions\nwere returned values like outputs. 1860 01:47:52,761 --> 01:47:56,151 Then we talked a bit about conditionals,\n 1861 01:47:56,150 --> 01:47:59,541 Boolean expressions, which are\n 1862 01:47:59,541 --> 01:48:03,141 questions, loops, which let you do\n 1863 01:48:03,140 --> 01:48:05,990 like in math, that let you\nstore values temporarily 1864 01:48:05,990 --> 01:48:07,671 and then even other topics still. 1865 01:48:07,671 --> 01:48:11,421 So if you were comfortable on the\n 1866 01:48:11,421 --> 01:48:14,490 realize that all of these topics\nare going to remain with us. 1867 01:48:14,490 --> 01:48:18,230 So really, today is just about acquiring\n 1868 01:48:18,230 --> 01:48:22,911 you translate those ideas into,\n 1869 01:48:22,911 --> 01:48:25,698 a new syntax, frankly,\nthat's actually more 1870 01:48:25,698 --> 01:48:27,740 simple in some ways than\nyour own human language 1871 01:48:27,740 --> 01:48:31,422 be it English or something else, because\n 1872 01:48:31,422 --> 01:48:33,380 There's actually far less\nsyntax that you might 1873 01:48:33,381 --> 01:48:35,451 have in, say, a typical human language. 
1874 01:48:35,451 --> 01:48:39,600 But you need to be with these computer\n 1875 01:48:39,600 --> 01:48:41,850 so that you're most,\nultimately, correct 1876 01:48:41,850 --> 01:48:46,040 and ultimately will see to your code\n 1877 01:48:46,621 --> 01:48:49,791 So if you think about the last time\n 1878 01:48:49,791 --> 01:48:51,980 knowing what you were doing\nor encountered something new-- 1879 01:48:51,980 --> 01:48:54,855 might not have been that long ago,\n 1880 01:48:54,855 --> 01:48:58,460 first time, or Old Campus or the like,\n 1881 01:48:58,461 --> 01:49:01,823 you didn't really need to know how\n 1882 01:49:01,823 --> 01:49:03,530 You didn't need to\nknow who everyone was 1883 01:49:03,530 --> 01:49:07,230 where everything was, how Harvard or\n 1884 01:49:07,730 --> 01:49:10,855 You sort of got by day to day by just\n 1885 01:49:10,855 --> 01:49:12,605 And anything you didn't\nreally understand 1886 01:49:12,605 --> 01:49:14,960 you sort of turned a blind\neye to until it's important. 1887 01:49:14,961 --> 01:49:16,761 And that's, indeed, what\nwe're going to do today. 1888 01:49:16,761 --> 01:49:18,636 And really, for the next\nseveral weeks, we'll 1889 01:49:18,636 --> 01:49:21,171 focus on details that\nare initially important 1890 01:49:21,171 --> 01:49:24,342 and try to wave our hands, so to speak,\n 1891 01:49:24,342 --> 01:49:25,800 we'll get to, might be interesting. 1892 01:49:25,801 --> 01:49:27,468 But for now, they might be distractions. 1893 01:49:27,467 --> 01:49:29,900 And by distractions, I really\nmean some of that syntax 1894 01:49:31,501 --> 01:49:34,671 So by the end of today-- and\n 1895 01:49:34,671 --> 01:49:37,400 your first foray, presumably,\ninto this language called C-- 1896 01:49:37,400 --> 01:49:39,171 you'll have written some code. 
1897 01:49:39,171 --> 01:49:41,780 And you'll be asking yourself--\nwe'll be asking yourselves-- 1898 01:49:43,161 --> 01:49:46,041 Well, first and foremost, per\nlast week, be it in Scratch 1899 01:49:46,041 --> 01:49:51,591 or phone book form, code ultimately\n 1900 01:49:51,591 --> 01:49:53,541 You want the problem\nto be solved correctly. 1901 01:49:53,541 --> 01:49:55,490 So that one sort of goes without saying. 1902 01:49:55,490 --> 01:49:59,511 And along the way this term, we'll\n 1903 01:49:59,511 --> 01:50:02,746 so you don't have to just sit there\n 1904 01:50:02,746 --> 01:50:05,371 checking the output, trying\nanother input, checking the output. 1905 01:50:05,371 --> 01:50:07,161 There's a lot of automation\ntools in the real world-- 1906 01:50:07,161 --> 01:50:09,201 and in this class and\nothers like it-- that 1907 01:50:09,201 --> 01:50:13,011 will help facilitate you answering\n 1908 01:50:13,011 --> 01:50:15,901 correct, according to our\nspecifications or the like. 1909 01:50:15,900 --> 01:50:18,440 But then something that's\ngoing to take more time 1910 01:50:18,440 --> 01:50:21,860 and you're probably not going to feel\n 1911 01:50:21,860 --> 01:50:24,890 the first weeks, is just how\nwell designed your code is. 1912 01:50:24,890 --> 01:50:27,613 It's one thing to speak\nEnglish or write English 1913 01:50:27,613 --> 01:50:30,030 but it's another thing-- or\nany language, for that matter. 1914 01:50:30,030 --> 01:50:32,190 But it's another thing to\nspeak it or write it well. 1915 01:50:32,190 --> 01:50:35,148 And we spend all these years in middle\n 1916 01:50:35,149 --> 01:50:38,601 writing papers and other documents,\n 1917 01:50:38,600 --> 01:50:41,810 as to how well formulated your\n 1918 01:50:41,810 --> 01:50:43,019 your paper was, and the like. 1919 01:50:43,020 --> 01:50:44,971 And there's that same\nidea in programming. 
1920 01:50:44,970 --> 01:50:49,590 It doesn't matter necessarily that\n 1921 01:50:49,591 --> 01:50:53,615 If your code is a complete visual\nmess, or if it's crazy long 1922 01:50:53,615 --> 01:50:55,490 it's going to be really\nhard for someone else 1923 01:50:55,490 --> 01:50:59,121 to wrap their mind around what your code\n 1924 01:51:00,201 --> 01:51:04,011 And honestly, you-- the\nnext morning, the next year 1925 01:51:04,011 --> 01:51:07,551 the next time you look at\nthat code-- might have no idea 1926 01:51:07,551 --> 01:51:09,381 what you yourself were even thinking. 1927 01:51:09,381 --> 01:51:13,731 But you will if you focus,\ntoo, on designing good code 1928 01:51:13,730 --> 01:51:16,640 getting your algorithms efficient,\n 1929 01:51:16,640 --> 01:51:19,161 and even making sure your\ncode looks pretty, which 1930 01:51:19,161 --> 01:51:20,881 we'd describe as a matter of style. 1931 01:51:20,881 --> 01:51:24,514 So in the written human world, having\n 1932 01:51:24,514 --> 01:51:27,181 capitalization and the like-- the\nsort of way you write an essay 1933 01:51:27,180 --> 01:51:29,220 but not necessarily\nsend a text message-- 1934 01:51:29,220 --> 01:51:31,420 relates to style, for instance. 1935 01:51:31,421 --> 01:51:33,360 And so good style in\ncode is going to have 1936 01:51:33,360 --> 01:51:36,961 a few of these characteristics that are\n 1937 01:51:36,961 --> 01:51:41,981 But you just have to start to get in the\n 1938 01:51:41,980 --> 01:51:44,940 So these three axes, so to speak,\n 1939 01:51:44,940 --> 01:51:48,121 are really the overarching\ngoals when writing code that 1940 01:51:48,121 --> 01:51:50,201 ultimately is going to look like this. 1941 01:51:50,201 --> 01:51:52,201 So this program we\nconjectured last week does 1942 01:51:52,201 --> 01:51:56,863 what if you run it on a Mac or\nPC or somewhere else, presumably? 1943 01:52:00,261 --> 01:52:02,136 DAVID J. MALAN: It just\nprints, Hello, world. 
1944 01:52:02,136 --> 01:52:04,311 And honestly, that's kind\nof atrocious that you 1945 01:52:04,310 --> 01:52:08,060 need to hit your keyboard keys this\n 1946 01:52:08,060 --> 01:52:09,740 to get a program to say, Hello, world. 1947 01:52:09,740 --> 01:52:11,990 So a spoiler-- in a\nfew weeks' time when we 1948 01:52:11,990 --> 01:52:14,451 introduce other, more modern\nlanguages, like Python 1949 01:52:14,451 --> 01:52:18,631 you can distill this same logic\ninto literally one line of code. 1950 01:52:18,631 --> 01:52:20,301 And so we're getting there, ultimately. 1951 01:52:20,301 --> 01:52:23,294 But it's helpful to understand\nwhat it is that's going on here 1952 01:52:23,293 --> 01:52:25,461 because even though this\nis a pretty cryptic syntax 1953 01:52:25,461 --> 01:52:28,101 there's nothing after this week and,\n 1954 01:52:28,100 --> 01:52:30,920 be able to understand even about\nsomething that right now looks 1955 01:52:30,921 --> 01:52:32,311 a little something like this. 1956 01:52:33,644 --> 01:52:35,811 Well, I've given us sort\nof the answer to a problem. 1957 01:52:35,810 --> 01:52:37,728 How do you print, Hello,\nworld, on the screen? 1958 01:52:37,728 --> 01:52:39,177 So what do I do with this code? 1959 01:52:39,177 --> 01:52:42,260 Well, we're in the habit of typically\n 1960 01:52:43,911 --> 01:52:48,030 And yeah, I could open up Word or\n 1961 01:52:48,030 --> 01:52:50,780 and just literally transcribe\nthat character for character 1962 01:52:50,780 --> 01:52:53,390 save it, and boom, I've got a program. 1963 01:52:53,390 --> 01:52:56,630 But the problem, per last week, is\n 1964 01:52:56,631 --> 01:52:59,164 what other language, so to speak? 1965 01:53:00,081 --> 01:53:02,310 DAVID J. MALAN: Yeah, so\nbinary, zeros and ones. 1966 01:53:02,310 --> 01:53:04,310 And so this, obviously,\nis not zeros and ones. 
1967 01:53:04,310 --> 01:53:07,040 So it doesn't matter if I put it in\n 1968 01:53:07,632 --> 01:53:10,050 The computer is not going to\nunderstand it until I somehow 1969 01:53:10,050 --> 01:53:11,570 translate it to zeros and ones. 1970 01:53:11,570 --> 01:53:14,121 And honestly, none of those\ntools that I rattled off 1971 01:53:14,121 --> 01:53:15,980 are really appropriate for programming. 1972 01:53:16,551 --> 01:53:19,011 Well, they come with features\nlike bold facing and italics 1973 01:53:19,011 --> 01:53:22,881 and sort of fluffy, aesthetic stuff\n 1974 01:53:22,881 --> 01:53:24,631 you're trying to do with your code. 1975 01:53:24,631 --> 01:53:26,661 And they don't have\nthe ability, it would 1976 01:53:26,661 --> 01:53:29,661 seem, to convert that code\nultimately to zeros and ones. 1977 01:53:29,661 --> 01:53:32,451 But tools that do have\nthis capability might 1978 01:53:32,451 --> 01:53:36,350 be called Integrated Development\nEnvironments, or IDEs 1979 01:53:36,350 --> 01:53:38,180 or, more simply, text editors. 1980 01:53:38,180 --> 01:53:42,560 A text editor is a tool that a\nprogrammer uses perhaps every day 1981 01:53:44,030 --> 01:53:46,850 And it's a simple program-- here,\n 1982 01:53:46,850 --> 01:53:49,542 called Visual Studio Code, or VS Code. 1983 01:53:49,542 --> 01:53:51,500 And at the top here, you\nsee that I've actually 1984 01:53:51,501 --> 01:53:56,391 created in advance before class a very\n 1985 01:53:57,020 --> 01:54:00,740 Well, .c indicates by convention that\n 1986 01:54:01,640 --> 01:54:06,050 It's not .docx, which would mean in\n 1987 01:54:08,060 --> 01:54:12,621 This is .c, which means in this file is\n 1988 01:54:12,621 --> 01:54:16,070 C. This number 1 here is just an\n 1989 01:54:16,070 --> 01:54:18,440 me keep track of how long\nor short this program is. 1990 01:54:18,440 --> 01:54:21,050 And the cursor is just\nblinking there, waiting 1991 01:54:21,051 --> 01:54:23,030 for me to start typing some code. 
1992 01:54:23,030 --> 01:54:26,030 Well, let me go ahead and type\nout exactly the same code. 1993 01:54:26,030 --> 01:54:28,260 For me, it comes pretty\ncomfortably from memory. 1994 01:54:28,261 --> 01:54:31,791 So I'm going to go ahead and include\n 1995 01:54:32,961 --> 01:54:37,221 I'm going to magically type int\n 1996 01:54:37,220 --> 01:54:38,610 we'll come back to that later-- 1997 01:54:38,610 --> 01:54:43,560 one of these curly braces and then a\n 1998 01:54:43,560 --> 01:54:46,610 Then I'm going to hit Tab\nto indent a few spaces. 1999 01:54:46,610 --> 01:54:52,850 And then I'm going to type not print,\n 2000 01:54:52,850 --> 01:54:55,850 close quote, close\nparenthesis, semicolon. 2001 01:54:55,850 --> 01:55:00,050 And I dare say this was essentially\n 2002 01:55:01,220 --> 01:55:03,291 I wrote it to say, "Hi, CS50. 2003 01:55:03,291 --> 01:55:06,201 Now it just says the more canonical,\n 2004 01:55:08,390 --> 01:55:11,240 And all I need to now do is\nmaybe hit Command-S or Control-S 2005 01:55:12,110 --> 01:55:15,081 And voila, I am a programmer. 2006 01:55:15,081 --> 01:55:17,828 The catch though, is,\nOK, how do I run this? 2007 01:55:17,828 --> 01:55:19,911 Like, on your Mac or PC,\nhow do you run a program? 2008 01:55:19,911 --> 01:55:21,411 Well, usually double-click an icon. 2009 01:55:21,411 --> 01:55:23,121 On your phone, you tap an icon. 2010 01:55:23,121 --> 01:55:26,810 In this environment that we're using\n 2011 01:55:26,810 --> 01:55:31,670 say most programmers-- use, you don't\n 2012 01:55:32,511 --> 01:55:35,734 That's very user friendly,\nbut it's not very necessary. 
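For reference, the program typed a moment ago probably looks like the following. It is reconstructed from the spoken description (the include, `int main`, the curly braces, `printf` rather than print, the backslash n, and the semicolon), so treat the exact spelling and capitalization of the greeting as an assumption.

```c
#include <stdio.h>

int main(void)
{
    // printf, not print; \n moves the cursor to the next line
    printf("hello, world\n");
}
```

Saved in a file named hello.c, this is source code only. It still has to be converted to zeros and ones before the computer can run it.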
2013 01:55:35,734 --> 01:55:38,151 Especially when you get more\ncomfortable with programming 2014 01:55:38,150 --> 01:55:41,030 you're going to want to type commands\n 2015 01:55:42,020 --> 01:55:43,853 And you're going to\nwant to automate things 2016 01:55:43,854 --> 01:55:46,791 which is a lot easier if it's all\n 2017 01:55:46,791 --> 01:55:49,131 to mouse and muscular movements. 2018 01:55:49,131 --> 01:55:52,011 And so here I have my program. 2019 01:55:52,011 --> 01:55:54,591 It lives in this file called "hello.c. 2020 01:55:54,591 --> 01:55:58,131 I need to now convert it,\nthough, to zeros and ones. 2021 01:55:58,131 --> 01:56:02,331 Well, how do I go about doing this,\n 2022 01:56:03,680 --> 01:56:06,470 or source code, as it's\nconventionally called-- 2023 01:56:06,470 --> 01:56:11,040 to this, these zeros and ones that\n 2024 01:56:11,041 --> 01:56:13,490 The zeros and ones from last\nweek can be used not only 2025 01:56:13,490 --> 01:56:18,751 to represent numbers and letters,\n 2026 01:56:18,751 --> 01:56:24,051 It can also represent instructions to a\n 2027 01:56:24,051 --> 01:56:26,091 or delete a file, or save a file. 2028 01:56:26,091 --> 01:56:28,761 All the sort of basics\nof a computer somehow 2029 01:56:28,761 --> 01:56:32,361 can be represented by other\npatterns of zeros and ones. 2030 01:56:32,360 --> 01:56:34,934 And just like last week,\nit depends on the context 2031 01:56:34,934 --> 01:56:36,351 in which these numbers are stored. 2032 01:56:36,350 --> 01:56:39,680 Sometimes they're interpreted as\nnumbers, like in a spreadsheet. 2033 01:56:39,680 --> 01:56:41,390 Sometimes they're interpreted as colors. 2034 01:56:41,390 --> 01:56:46,040 Sometimes they're interpreted as\n 2035 01:56:46,041 --> 01:56:50,820 to do very low-level operations,\n 2036 01:56:50,820 --> 01:56:55,610 So fortunately, last week's definition\n 2037 01:56:55,610 --> 01:56:58,020 is a nice mental model for\nexactly the goal at hand. 
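The point that the same zeros and ones mean different things in different contexts can be seen from within C itself. In this small sketch, `describe_byte` is a hypothetical helper, and the letter shown assumes the ASCII encoding: one stored value is printed once as a number and once as a letter.

```c
#include <stdio.h>

// Writes two readings of the same byte into `out`: first interpreted
// as a decimal number (%d), then as a character (%c).
void describe_byte(unsigned char byte, char *out, size_t size)
{
    snprintf(out, size, "number %d, letter %c", byte, byte);
}
```

Passing 72 yields "number 72, letter H" under ASCII. The bits never change; only the interpretation does, and machine code is one more such interpretation, this time as instructions.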
2038 01:56:58,020 --> 01:57:00,800 I have some input, AKA source code. 2039 01:57:00,801 --> 01:57:05,001 I want to output ultimately\nmachine code, those zeros and ones. 2040 01:57:05,001 --> 01:57:07,581 I certainly don't want to do\nthis kind of process by hand. 2041 01:57:07,581 --> 01:57:11,181 So hopefully there's an algorithm\n 2042 01:57:12,626 --> 01:57:14,751 And those of you who do\nhave some prior experience 2043 01:57:14,751 --> 01:57:16,947 this program might be called a? 2044 01:57:18,150 --> 01:57:20,150 So a few of you have,\nindeed, programmed before. 2045 01:57:20,150 --> 01:57:21,831 Not all languages use compilers. 2046 01:57:21,831 --> 01:57:24,351 C, in fact, is a language\nthat does use a compiler. 2047 01:57:24,350 --> 01:57:27,680 And so I just need to find myself-- 2048 01:57:27,680 --> 01:57:31,280 on my computer somewhere,\npresumably-- a so-called compiler 2049 01:57:31,280 --> 01:57:35,730 a program whose purpose in life is\n 2050 01:57:35,730 --> 01:57:40,460 And source code written textually\n 2051 01:57:41,661 --> 01:57:44,490 The machine code is the\ncorresponding zeros and ones. 2052 01:57:44,490 --> 01:57:48,261 So let me go back to the same\nprogramming environment called 2053 01:57:48,261 --> 01:57:50,421 Visual Studio Code or VS Code. 2054 01:57:50,421 --> 01:57:53,541 This is typically a program you\n 2055 01:57:53,541 --> 01:57:57,351 can download onto their own Mac or\n 2056 01:57:57,350 --> 01:57:59,750 computer you own writing some code. 2057 01:57:59,751 --> 01:58:02,181 A downside, though, of that\napproach is that all of us 2058 01:58:02,180 --> 01:58:04,790 have slightly different\nversions of Macs or PCs. 2059 01:58:04,791 --> 01:58:07,371 We have slightly different\nversions of operating systems. 2060 01:58:07,371 --> 01:58:08,931 They may or may not be up to date. 
2061 01:58:08,930 --> 01:58:13,020 It's just a technical support nightmare\n 2062 01:58:13,020 --> 01:58:15,831 especially for an introductory\n 2063 01:58:15,831 --> 01:58:19,201 be on the same page so we can\nget you up and running quickly. 2064 01:58:19,201 --> 01:58:23,180 And so I'm actually using a cloud-based\n 2065 01:58:23,180 --> 01:58:25,970 that you only need a browser to access. 2066 01:58:25,970 --> 01:58:28,760 And then you can be on any\ncomputer, today or tomorrow. 2067 01:58:28,761 --> 01:58:32,341 By the end of the semester, we're\n 2068 01:58:32,341 --> 01:58:36,021 so to speak, as best we can and\nget you onto your own Mac or PC 2069 01:58:36,020 --> 01:58:39,530 so that after this class, especially if\n 2070 01:58:39,530 --> 01:58:43,131 you feel like you can continue\n 2071 01:58:45,270 --> 01:58:47,960 But for now, wonderfully, the\nbrowser version of VS Code 2072 01:58:47,961 --> 01:58:51,793 should pretty much be identical\n 2073 01:58:51,792 --> 01:58:53,000 version of the same would be. 2074 01:58:53,001 --> 01:58:55,521 And you'll see in problem\nset 1 how to access this 2075 01:58:55,520 --> 01:58:58,791 and how to get going yourself\nwith your first programs. 2076 01:58:58,791 --> 01:59:01,641 But I haven't mentioned this\nbottom part of the screen 2077 01:59:01,640 --> 01:59:03,140 this bottom part of the screen. 2078 01:59:03,140 --> 01:59:06,570 And this is an area where we have\n 2079 01:59:06,570 --> 01:59:10,400 So this is sort of old-school technology\n 2080 01:59:10,400 --> 01:59:15,380 to interact with a computer, wherever\n 2081 01:59:15,381 --> 01:59:17,601 or even, in this case, in the cloud. 2082 01:59:17,600 --> 01:59:20,150 So on the top-hand\nportion of this screen 2083 01:59:20,150 --> 01:59:24,740 is my text editor, like tabbed\n 2084 01:59:24,740 --> 01:59:26,930 I can just create files and write code. 
2085 01:59:26,930 --> 01:59:30,050 The bottom of the screen here,\nmy so-called terminal window 2086 01:59:30,051 --> 01:59:33,141 gives me the ability to\nrun commands on a server 2087 01:59:33,140 --> 01:59:35,600 that currently I have\nexclusive access to. 2088 01:59:35,600 --> 01:59:39,680 So because I logged into VS\nCode with my account online 2089 01:59:39,680 --> 01:59:44,451 I have my own sort of virtual\n 2090 01:59:44,451 --> 01:59:46,701 otherwise known as, in\nthis context, a container. 2091 01:59:46,701 --> 01:59:49,770 This has its own operating system\nfor me, its own hard drive 2092 01:59:49,770 --> 01:59:52,640 if you will, where I can save\nand create files of my own 2093 01:59:52,640 --> 01:59:54,990 separate from yours and vice versa. 2094 01:59:54,990 --> 01:59:57,381 And it's at this very\nsimple prompt, which 2095 01:59:57,381 --> 02:00:00,218 is conventionally-- but not always--\n 2096 02:00:00,217 --> 02:00:01,550 has nothing to do with currency. 2097 02:00:01,551 --> 02:00:03,644 It just means, type your commands here. 2098 02:00:03,644 --> 02:00:05,811 This is where I'm going to\nbe able to type commands 2099 02:00:05,810 --> 02:00:09,871 like compile my source\ncode into machine code. 2100 02:00:09,871 --> 02:00:15,523 So it's a Command Line Interface, or\n 2101 02:00:15,523 --> 02:00:18,230 that you might not have ever used\n 2102 02:00:19,220 --> 02:00:23,730 Odds are almost all of us in this room\n 2103 02:00:23,730 --> 02:00:26,990 but we're all going to start using an\n 2104 02:00:26,990 --> 02:00:30,140 is in a family of operating systems\n 2105 02:00:30,140 --> 02:00:33,230 line interface, but are used not\n 2106 02:00:33,230 --> 02:00:35,850 websites and developing\napplications and the like. 2107 02:00:35,850 --> 02:00:40,290 And it's, indeed, a familiar and very\n 2108 02:00:40,291 --> 02:00:45,801 So how do I go about making this\nfile, hello.c, into a program? 
2109 02:00:45,801 --> 02:00:48,740 There's no icon to double-click,\nbut there is a command. 2110 02:00:48,740 --> 02:00:54,140 I can type, make hello, at this dollar\n 2111 02:00:54,140 --> 02:00:56,300 and nothing appears to happen. 2112 02:00:57,533 --> 02:00:59,490 And as we'll see in\nprogramming, almost always 2113 02:00:59,490 --> 02:01:02,661 if you don't see anything go wrong,\n 2114 02:01:02,661 --> 02:01:04,400 So this is going to\nbe a rarity at first 2115 02:01:04,400 --> 02:01:07,490 but this is a good thing that\nit just seems to do nothing. 2116 02:01:07,490 --> 02:01:11,661 But now there is in the\nfolder in my accounts 2117 02:01:11,661 --> 02:01:15,261 in this on the cloud\na file called "hello. 2118 02:01:15,261 --> 02:01:18,771 And it's a bit of a weird command,\n 2119 02:01:19,581 --> 02:01:22,581 . just means go into my current folder. 2120 02:01:22,581 --> 02:01:28,371 /hello means run the program called\n 2121 02:01:28,371 --> 02:01:32,871 So ./hello, and then Enter, and voila,\n 2122 02:01:40,110 --> 02:01:42,770 I'm going to go ahead and open\nup the sidebar of this program 2123 02:01:42,770 --> 02:01:44,911 and you'll see in problem\nset 1 how to do this. 2124 02:01:44,911 --> 02:01:48,051 And this might look a little different\n 2125 02:01:48,051 --> 02:01:51,191 Even the color scheme I'm using might\n 2126 02:01:51,190 --> 02:01:52,940 because it supports a\nnice colorful theme. 2127 02:01:52,940 --> 02:01:56,720 So you can have different colors\nand brightnesses depending 2128 02:01:56,720 --> 02:01:58,250 on your mood or the time of day. 2129 02:01:58,251 --> 02:02:01,641 What I've opened here, though, is\n 2130 02:02:01,640 --> 02:02:04,278 and this is just all of the\nfiles in my cloud account. 2131 02:02:04,279 --> 02:02:05,571 And there's not many right now. 2132 02:02:06,576 --> 02:02:09,081 One is the file called\nhello.c, and it's highlighted 2133 02:02:09,081 --> 02:02:10,851 because I've got it open right there. 
2134 02:02:10,850 --> 02:02:14,330 And the other is a file called\n"hello," which is brand new 2135 02:02:14,331 --> 02:02:17,331 and was created when I ran that command. 2136 02:02:17,331 --> 02:02:21,431 And what's now worth noting is that\n 2137 02:02:22,780 --> 02:02:26,140 Like on the left-hand side, you have\n 2138 02:02:26,140 --> 02:02:30,275 But on the bottom here, again, you\n 2139 02:02:30,275 --> 02:02:32,650 These are just different ways\nto interact with computers 2140 02:02:32,650 --> 02:02:34,193 and you'll get comfortable with both. 2141 02:02:34,193 --> 02:02:37,451 And honestly, you're certainly familiar\n 2142 02:02:37,451 --> 02:02:40,911 so it's the command line one\nwith which we'll spend some time. 2143 02:02:40,911 --> 02:02:43,600 Now suppose that I just\nwanted to do something 2144 02:02:43,600 --> 02:02:45,438 more than compile this program. 2145 02:02:45,439 --> 02:02:47,231 Suppose I wanted to go\nahead and remove it. 2146 02:02:47,230 --> 02:02:48,701 Like, uh-uh, no, I made a mistake. 2147 02:02:48,701 --> 02:02:51,070 I want to say, "Hello,\nCS50," not "Hello, world." 2148 02:02:51,070 --> 02:02:55,121 I could just hover up here, like in\n 2149 02:02:55,121 --> 02:02:57,881 and I could poke around, and\nthere, delete permanently. 2150 02:02:57,881 --> 02:03:00,371 So most of us might have\nthat instinct on a Mac or PC. 2151 02:03:00,371 --> 02:03:02,801 You right-click or Control-click,\nand you poke around. 2152 02:03:02,801 --> 02:03:05,951 But in a command line interface,\nlet me do this instead. 2153 02:03:05,951 --> 02:03:08,440 The command for removing\nor deleting a file 2154 02:03:08,440 --> 02:03:12,100 in the world of Linux, this\nother operating system 2155 02:03:12,100 --> 02:03:16,330 is just to type rm for remove,\nand then "hello," Enter. 2156 02:03:16,331 --> 02:03:19,091 It's a somewhat cryptic confirmation\n 2157 02:03:19,961 --> 02:03:21,761 I'm going to go ahead\nand type Y for Yes.
2158 02:03:21,761 --> 02:03:24,191 And now when I hit\nEnter, watch what happens 2159 02:03:24,190 --> 02:03:28,240 at top left in the Explorer, the\nGUI, the graphical interface. 2160 02:03:30,789 --> 02:03:32,980 Not terribly exciting,\nbut this just means 2161 02:03:32,980 --> 02:03:35,539 this is a graphical version\nof what we're seeing here. 2162 02:03:35,539 --> 02:03:38,800 And in fact, if you want to\nnever use the GUI again-- 2163 02:03:38,800 --> 02:03:41,690 I'll go ahead and close it\nwith a keyboard shortcut here-- 2164 02:03:41,690 --> 02:03:45,430 you can forever just type\nls for list and hit Enter. 2165 02:03:45,430 --> 02:03:48,100 And you will see in the\ncommand line interface 2166 02:03:48,100 --> 02:03:50,617 all of the files in your current folder. 2167 02:03:50,618 --> 02:03:52,451 So anything you can do\nwith a mouse, you can 2168 02:03:52,451 --> 02:03:54,039 do with this command line interface. 2169 02:03:54,039 --> 02:03:57,411 And indeed, we'll see many more\nthings that you can do as well. 2170 02:03:57,411 --> 02:04:01,091 But the inventors of this, this\n 2171 02:04:02,291 --> 02:04:04,360 Like, the command is rm for remove. 2172 02:04:07,390 --> 02:04:10,480 Because it's just faster to type. 2173 02:04:10,480 --> 02:04:14,260 So before we forge ahead with making\n 2174 02:04:14,261 --> 02:04:16,181 just "Hello, world,"\nlet me pause here to see 2175 02:04:16,180 --> 02:04:19,420 if there's questions on\nsource code or machine 2176 02:04:19,421 --> 02:04:24,221 code or compiler or this\ncommand line interface. 2177 02:04:26,628 --> 02:04:28,921 DAVID J. MALAN: Really good\nquestion, and let me recap. 2178 02:04:28,921 --> 02:04:30,751 If I were to make\nchanges to the program 2179 02:04:30,751 --> 02:04:33,871 run it, and then maybe make other\nchanges and try to rerun it 2180 02:04:33,871 --> 02:04:37,028 would those changes be reflected,\n 2181 02:04:37,860 --> 02:04:39,961 I already removed the old version. 
2182 02:04:39,961 --> 02:04:43,530 So let me go ahead and point\nout that if I do ./hello now 2183 02:04:43,530 --> 02:04:47,760 I'm going to see some kind of error\n 2184 02:04:47,761 --> 02:04:50,431 No such file or directory, so\nit's not terribly user friendly 2185 02:04:50,430 --> 02:04:51,960 but it's saying what the problem is. 2186 02:04:51,961 --> 02:04:55,231 Let me go ahead and remake\nit by typing make hello. 2187 02:04:55,230 --> 02:04:59,911 Now if I type ls, I'll see not one\n 2188 02:04:59,911 --> 02:05:03,615 is even green with a little asterisk\n 2189 02:05:03,615 --> 02:05:05,490 It's sort of the textual\nversion of something 2190 02:05:05,490 --> 02:05:07,251 you could double-click\nin our human world. 2191 02:05:07,251 --> 02:05:10,501 So now, of course, if I run hello, we're\n 2192 02:05:10,501 --> 02:05:15,811 But now suppose I change it to\n 2193 02:05:15,810 --> 02:05:21,000 Let me go ahead and save the file with\n 2194 02:05:21,001 --> 02:05:23,681 let me run ./hello again, and voila. 2195 02:05:25,171 --> 02:05:27,431 So let me ask someone else\nto answer that question. 2196 02:05:29,011 --> 02:05:31,051 Why did it not say, "Hello, CS50. 2197 02:05:33,171 --> 02:05:35,240 DAVID J. MALAN: Yeah, so\nI didn't compile it again. 2198 02:05:35,240 --> 02:05:38,421 So sort of newbie mistake, you're going\n 2199 02:05:38,990 --> 02:05:42,051 But now let me go ahead\nand remake hello, enter. 2200 02:05:42,051 --> 02:05:45,171 It's going to seemingly\nmake the same program. 2201 02:05:45,171 --> 02:05:49,431 But this time when I run\nit, it\'s, "Hello, CS50. 2202 02:05:49,430 --> 02:05:53,180 Any other questions on some\nof these building blocks? 2203 02:05:53,180 --> 02:05:56,580 And we'll come back to all the\ncrazy syntax I typed before long. 2204 02:05:56,581 --> 02:05:58,881 But for now, we're focusing\non just the output. 2205 02:06:00,853 --> 02:06:02,560 DAVID J. 
MALAN: When\nI keep running make 2206 02:06:02,560 --> 02:06:05,510 it creates a new version\nof the machine code. 2207 02:06:05,511 --> 02:06:09,490 So it keeps changing the hello program\n 2208 02:06:09,490 --> 02:06:12,599 There's no make file, per se. 2209 02:06:13,634 --> 02:06:15,051 DAVID J. MALAN: Good question, no. 2210 02:06:15,051 --> 02:06:18,141 If I open up that directory, you'll\n 2211 02:06:18,140 --> 02:06:21,230 And it doesn't matter how\nmany times I run make hello-- 2212 02:06:21,230 --> 02:06:25,251 three, four, five-- it just\nkeeps overwriting the original. 2213 02:06:25,251 --> 02:06:28,753 So it's kind of like just saving in\n 2214 02:06:29,461 --> 02:06:31,003 But there's an additional step today. 2215 02:06:31,002 --> 02:06:36,280 We have to then convert my words to\n 2216 02:06:38,541 --> 02:06:40,730 DAVID J. MALAN: Oh, what\nhappens if I run hello.c? 2217 02:06:40,730 --> 02:06:45,081 So let me go ahead and do ./hello.c,\n 2218 02:06:48,801 --> 02:06:50,961 This is where the error\nmessages mean something 2219 02:06:50,961 --> 02:06:53,661 to the people who designed the operating\n 2220 02:06:53,661 --> 02:06:55,701 It's not that you don't\nhave access to the file. 2221 02:06:55,701 --> 02:06:57,411 It means that it's not executable. 2222 02:06:57,411 --> 02:07:00,201 This is not something you\nhave permission to run 2223 02:07:00,201 --> 02:07:04,743 but you do have permission to read\n 2224 02:07:06,520 --> 02:07:08,228 DAVID J. MALAN: Oh,\nreally good question. 2225 02:07:08,229 --> 02:07:11,801 So if I have named my file, hello\n 2226 02:07:11,801 --> 02:07:15,761 dot C, of the things that\nMake does is it automatically 2227 02:07:18,911 --> 02:07:20,711 we'll discuss this a bit more next week. 2228 02:07:20,711 --> 02:07:23,711 Make itself-- is kind of the\nfirst of white lies today-- 2229 02:07:25,480 --> 02:07:30,820 It's a program that knows how to find\n 2230 02:07:30,820 --> 02:07:33,161 and automatically create the program. 
2231 02:07:33,161 --> 02:07:37,091 If I use, as we'll discuss next\n 2232 02:07:37,091 --> 02:07:41,081 I have to type a much longer sequence\n 2233 02:07:41,081 --> 02:07:43,270 what do I want the name\nof my program to be. 2234 02:07:43,270 --> 02:07:45,310 Make is a nice program,\nespecially in week 1 2235 02:07:45,310 --> 02:07:47,661 because it just automates\nall of that for us. 2236 02:07:47,661 --> 02:07:51,221 And so here, we have now a\nprogram that very simply prints 2237 02:07:52,220 --> 02:07:54,790 So let's now put this\ninto the context of where 2238 02:07:54,791 --> 02:07:58,871 we left off last time in the context\n 2239 02:07:58,871 --> 02:08:01,961 So we discussed last time, of\ncourse, functions and arguments. 2240 02:08:01,961 --> 02:08:07,011 Functions, again, are those actions and\n 2241 02:08:07,011 --> 02:08:09,403 And the arguments were the\ninputs to those functions 2242 02:08:09,403 --> 02:08:11,860 generally in those little white\novals that, in Scratch, you 2243 02:08:11,860 --> 02:08:13,961 could type words or numbers into. 2244 02:08:13,961 --> 02:08:16,961 We'll see, in all of the languages\nwe're going to see this term 2245 02:08:18,283 --> 02:08:20,990 And let's just start to translate\n 2246 02:08:20,990 --> 02:08:24,911 So for instance, let's put\nthis same program in C 2247 02:08:26,110 --> 02:08:29,621 This is what Hello, World looked like\n 2248 02:08:29,621 --> 02:08:32,530 This week, of course,\nit looks like print. 2249 02:08:32,530 --> 02:08:34,990 And then the parentheses,\nnotice, are kind of 2250 02:08:34,990 --> 02:08:38,679 deliberately designed in the world of\n 2251 02:08:38,679 --> 02:08:40,721 Even though this is a\nwhite oval, you kind of get 2252 02:08:40,720 --> 02:08:45,400 that it's kind of evoking that\nsame idea with the parentheses. 2253 02:08:45,400 --> 02:08:49,180 Technically the function\nin C, it's not called say.
2254 02:08:51,671 --> 02:08:55,061 The F stands for formatted, but we'll\n 2255 02:08:55,060 --> 02:08:57,610 But printf is the closest\nanalogous function 2256 02:08:57,610 --> 02:09:00,881 for say in the world of\nC. Notice, though, if you 2257 02:09:00,881 --> 02:09:05,501 want to print something like\nHello, World or Hello, CS50 in C 2258 02:09:05,501 --> 02:09:08,980 you don't just write the\nwords as we did last week. 2259 02:09:08,980 --> 02:09:11,411 You also have to add what,\nif you notice already 2260 02:09:11,411 --> 02:09:13,360 what's missing from this version. 2261 02:09:13,360 --> 02:09:15,680 Yeah, so the double quotes\non the left and the right. 2262 02:09:15,680 --> 02:09:20,998 So, that's necessary in C whenever\nyou have a string of words. 2263 02:09:20,998 --> 02:09:22,541 And I'm using that word deliberately. 2264 02:09:22,541 --> 02:09:26,831 Whenever you have multiple words\n 2265 02:09:27,461 --> 02:09:30,971 And you have to put it in double\nquotes, not single quotes. 2266 02:09:30,970 --> 02:09:32,680 You have to put it in double quotes. 2267 02:09:32,680 --> 02:09:36,880 There's one other stupid thing\nthat we need to have in my C code 2268 02:09:36,881 --> 02:09:40,971 in order to get this function to do\n 2269 02:09:41,841 --> 02:09:44,240 So just like in our human\nworld, you eventually 2270 02:09:44,240 --> 02:09:47,270 got into the habit of using, at\n 2271 02:09:47,270 --> 02:09:50,690 Semicolon is generally what\nyou use to finish your thought 2272 02:09:50,690 --> 02:09:53,780 in the world of programming with C. 2273 02:09:53,780 --> 02:09:55,911 All right, so we have\nthat function in place. 2274 02:09:55,911 --> 02:09:59,661 Now, what does this really fit\n 2275 02:09:59,661 --> 02:10:01,070 Well, functions take arguments. 2276 02:10:01,070 --> 02:10:04,760 And it turns out functions can\nhave different types of outputs. 2277 02:10:04,761 --> 02:10:07,071 And we've actually seen\nboth already last week.
2278 02:10:07,070 --> 02:10:11,360 One type of output from a function\n 2279 02:10:11,360 --> 02:10:13,430 And it generally refers\nto something visual 2280 02:10:13,430 --> 02:10:17,570 like something appearing on the screen\n 2281 02:10:17,570 --> 02:10:20,211 It's sort of a side effect of\nthe function doing its thing. 2282 02:10:20,211 --> 02:10:24,231 And indeed, last week we saw this in\n 2283 02:10:24,230 --> 02:10:27,180 like Hello, World as\ninput to the say function. 2284 02:10:27,180 --> 02:10:31,040 And we saw on the screen Hello,\n 2285 02:10:32,930 --> 02:10:35,840 You can't actually do anything\nwith that visual output 2286 02:10:35,841 --> 02:10:38,581 other than consume it,\nvisually, with your human eyes. 2287 02:10:38,581 --> 02:10:43,251 But sometimes, recall last week, we\n 2288 02:10:43,251 --> 02:10:44,961 actually returned me some value. 2289 02:10:44,961 --> 02:10:47,001 Remember the ask, what's your name. 2290 02:10:47,001 --> 02:10:50,684 It handed me back whatever\nanswer the human typed in. 2291 02:10:50,684 --> 02:10:52,851 It didn't just arbitrarily\ndisplay it on the screen. 2292 02:10:52,850 --> 02:10:55,190 The cat didn't necessarily\nsay it on the screen. 2293 02:10:55,190 --> 02:11:01,640 It was stored, instead, in that special\n 2294 02:11:01,640 --> 02:11:05,510 Because some functions have not\nside effects but return values. 2295 02:11:05,511 --> 02:11:09,321 They hand you back an output\nthat you can use and reuse 2296 02:11:09,320 --> 02:11:11,961 unlike the side effect, which,\nagain displays and that's it. 2297 02:11:11,961 --> 02:11:14,461 You can't sort of catch\nit and hold on to it. 2298 02:11:14,461 --> 02:11:17,601 So, in the context of last\nweek, we had the ask block. 2299 02:11:17,600 --> 02:11:20,270 And that had this special\nanswer return value. 2300 02:11:20,270 --> 02:11:22,970 In C, we're going to\nsee in just a moment 2301 02:11:22,970 --> 02:11:25,400 we could translate this as follows. 
2302 02:11:25,400 --> 02:11:29,001 The closest match I can\npropose for the ask block 2303 02:11:29,001 --> 02:11:31,581 is a function that we're going\nto start calling get string. 2304 02:11:31,581 --> 02:11:34,730 String is, again, a word, a\nset of words, like a phrase 2305 02:11:34,730 --> 02:11:36,710 or a sentence in programming. 2306 02:11:36,711 --> 02:11:41,331 It, too, is a function insofar as\n 2307 02:11:41,331 --> 02:11:43,551 this isn't always true--\nbut very often when 2308 02:11:43,551 --> 02:11:48,261 you have a word in C followed by an open\n 2309 02:11:48,261 --> 02:11:50,961 it's most likely the name of a function. 2310 02:11:50,961 --> 02:11:53,461 And we're going to see that\nthere's some exceptions to that. 2311 02:11:53,461 --> 02:11:55,791 But for now this indeed\nlooks like a function 2312 02:11:55,791 --> 02:11:57,201 because it matches that pattern. 2313 02:11:57,201 --> 02:12:00,668 If I want to ask the question,\nwhat's your name, question mark-- 2314 02:12:00,668 --> 02:12:03,711 and I'm even going to deliberately\n 2315 02:12:03,711 --> 02:12:07,161 the cursor a little bit over so that\n 2316 02:12:08,190 --> 02:12:10,201 So that's just a nitpicky aesthetic. 2317 02:12:10,201 --> 02:12:14,480 This is perhaps the closest analog\nto just asking that question. 2318 02:12:14,480 --> 02:12:18,680 But because the ask\nblock returns a value 2319 02:12:18,680 --> 02:12:22,020 the analog here for get string is\nthat it, too, returns a value. 2320 02:12:22,020 --> 02:12:23,960 It doesn't just print the human's input. 2321 02:12:23,961 --> 02:12:28,131 It hands it back to you in the form\n 2322 02:12:28,131 --> 02:12:30,501 that I can then use and reuse. 2323 02:12:30,501 --> 02:12:33,471 Now ideally it would be as\nsimple as this literally 2324 02:12:33,470 --> 02:12:36,890 saying answer on the left equals.
2325 02:12:36,890 --> 02:12:38,780 And this is where\nthings start to diverge 2326 02:12:38,780 --> 02:12:40,640 from math and sort of our human world. 2327 02:12:40,640 --> 02:12:44,210 This equal sign, henceforth,\nis not the equal sign. 2328 02:12:44,211 --> 02:12:46,521 It is the assignment operator. 2329 02:12:46,520 --> 02:12:50,030 To assign a value means to\nstore a value in some variable. 2330 02:12:50,030 --> 02:12:53,640 And you read these things,\nweirdly, right to left. 2331 02:12:53,640 --> 02:12:55,820 So here is a function called get string. 2332 02:12:55,820 --> 02:12:58,310 I claim that it's going\nto return to you whatever 2333 02:12:58,310 --> 02:13:00,411 the human types in as their name. 2334 02:13:00,411 --> 02:13:03,320 It's going to get stored\nover here on the left because 2335 02:13:03,320 --> 02:13:06,331 of this so-called assignment\n 2336 02:13:06,331 --> 02:13:08,581 But it doesn't mean\nequality in this context. 2337 02:13:10,171 --> 02:13:14,961 But it does so by copying the value on\n 2338 02:13:14,961 --> 02:13:16,940 Unfortunately, we're not\nquite done yet with C. 2339 02:13:16,940 --> 02:13:19,670 And this is where, again, it\ngets a little annoying at first 2340 02:13:19,671 --> 02:13:23,780 where Scratch just let us express\n 2341 02:13:23,780 --> 02:13:27,110 In C when you have a\nvariable you don't just 2342 02:13:27,110 --> 02:13:29,001 give it a name like you did in Scratch. 2343 02:13:29,001 --> 02:13:33,051 You also have to tell the computer\nin advance what type of value 2344 02:13:34,310 --> 02:13:37,340 String is one such type of value. 2345 02:13:37,341 --> 02:13:40,070 Int, for integer, is\ngoing to be another. 2346 02:13:40,070 --> 02:13:42,860 And there's even more than that\nwe'll see today and beyond. 
2347 02:13:42,860 --> 02:13:46,070 And this is partly an answer to the\n 2348 02:13:46,070 --> 02:13:49,970 last week, which was how does a computer\n 2349 02:13:51,320 --> 02:13:55,518 Like is this a letter, a number,\na color, a piece of video? 2350 02:13:55,518 --> 02:13:58,350 And I just claimed last week that\n 2351 02:14:00,871 --> 02:14:04,190 But within those\nprograms, it often depends 2352 02:14:04,190 --> 02:14:08,810 on what the human programmer\nsaid the type of the value is. 2353 02:14:08,810 --> 02:14:10,850 If this specifies\nstring, that means 2354 02:14:10,850 --> 02:14:12,890 interpret the following\nzeros and ones that 2355 02:14:12,890 --> 02:14:16,640 are stored in my program as\nwords or letters, more generally. 2356 02:14:16,640 --> 02:14:20,960 If it's an int for integer, it would\n 2357 02:14:20,961 --> 02:14:24,981 treat the following zeros and\nones in my program as a number 2358 02:14:26,791 --> 02:14:29,661 So here's where this\nweek, unlike with Scratch 2359 02:14:29,661 --> 02:14:33,081 which kind of figures out what you\n 2360 02:14:33,081 --> 02:14:36,201 you have to be this pedantic\nand tell it what you mean. 2361 02:14:36,201 --> 02:14:39,721 There's still one stupid thing\nmissing from my code here. 2362 02:14:42,150 --> 02:14:44,230 DAVID J. MALAN: And we still\nneed the stupid semicolon. 2363 02:14:44,230 --> 02:14:45,648 And I'm sort of impugning it here. 2364 02:14:45,648 --> 02:14:48,060 Because honestly, these are\nthe kinds of stupid mistakes 2365 02:14:48,060 --> 02:14:50,701 you're going to make today,\ntomorrow, this weekend, next week 2366 02:14:50,701 --> 02:14:54,480 a few weeks from now, until you\n 2367 02:14:54,480 --> 02:14:58,440 as well as you do English or\nwhatever your spoken language is.
2368 02:15:00,511 --> 02:15:03,031 Suppose I mix apples and\noranges, so to speak 2369 02:15:03,030 --> 02:15:06,421 and I try to put a string in\nan int or an int in a string 2370 02:15:06,421 --> 02:15:08,581 the compiler is going to complain. 2371 02:15:08,581 --> 02:15:10,841 So when I run that make\ncommand as I did earlier 2372 02:15:10,841 --> 02:15:14,893 it's not going to be nice and blissfully\n 2373 02:15:14,893 --> 02:15:17,100 It's going to yell at me\nwith honestly a very cryptic 2374 02:15:17,100 --> 02:15:20,350 looking error message until we get\n 2375 02:15:23,100 --> 02:15:24,990 Ah, what happened to the backslash n. 2376 02:15:24,990 --> 02:15:26,820 So, we'll come back to that\nin just a moment, if we may. 2377 02:15:26,820 --> 02:15:29,778 Because I have deliberately omitted\n 2378 02:15:29,779 --> 02:15:32,311 And we'll see the different\nbehavior in a sec. 2379 02:15:37,470 --> 02:15:39,720 These are the kinds of\nthings that just matter. 2380 02:15:39,720 --> 02:15:43,710 And it's going to take time to recognize\n 2381 02:15:43,711 --> 02:15:48,306 Everything I've typed here except,\n 2382 02:15:48,305 --> 02:15:50,430 And the W is capitalized\njust because it's English. 2383 02:15:50,430 --> 02:15:51,900 Everything else is lowercase. 2384 02:15:51,900 --> 02:15:54,510 And this kind of varies by\nlanguage and also context. 2385 02:15:54,511 --> 02:15:58,621 So, in many languages the convention\n 2386 02:16:00,060 --> 02:16:02,440 Other languages might use\nsome capitals, as well. 2387 02:16:02,440 --> 02:16:04,050 But we'll talk about that before long. 2388 02:16:04,051 --> 02:16:05,951 But this is the kind\nof thing that matters 2389 02:16:05,951 --> 02:16:09,480 and is hard to see at first, especially\n 2390 02:16:09,480 --> 02:16:12,930 different when it's on your tiny\nlaptop screen from a capital S. 2391 02:16:12,930 --> 02:16:15,850 But you'll start to\ndevelop these instincts. 
2392 02:16:15,850 --> 02:16:18,060 All right, so besides\nthis particular block 2393 02:16:18,060 --> 02:16:22,180 let's go ahead and consider how we can\n 2394 02:16:22,180 --> 02:16:24,300 So let me switch back to VS Code here. 2395 02:16:24,301 --> 02:16:26,081 This was the program I had earlier. 2396 02:16:26,081 --> 02:16:28,891 And let me go ahead and\nundo my CS50 change. 2397 02:16:30,720 --> 02:16:35,251 Rerun Make on Hello with the original\n 2398 02:16:35,251 --> 02:16:37,230 Enter, nothing bad\nseems to have happened. 2399 02:16:37,230 --> 02:16:40,740 So dot slash Hello, enter Hello, World. 2400 02:16:40,740 --> 02:16:43,140 Now, if you're curious,\nthis is a good instinct 2401 02:16:43,140 --> 02:16:45,558 to start to acquire what\nhappens if I get rid of this. 2402 02:16:45,558 --> 02:16:47,850 Well, I'm probably not going\nto break things too badly. 2403 02:16:48,781 --> 02:16:51,240 Let me go ahead now and do Make Hello. 2404 02:16:52,331 --> 02:16:53,980 So it's not a really bad mistake. 2405 02:16:53,980 --> 02:16:56,700 So let me go ahead and\nrun dot slash Hello. 2406 02:16:58,921 --> 02:17:01,560 Yeah, what do you see that's different? 2407 02:17:01,560 --> 02:17:05,091 Yeah, the dollar sign, my so-called\n 2408 02:17:05,630 --> 02:17:08,780 Well, we can presumably\ninfer now that the backslash 2409 02:17:08,781 --> 02:17:12,531 n is some fancy notation for\nsaying create a new line 2410 02:17:12,531 --> 02:17:14,881 move the cursor, so to\nspeak, to the next line. 2411 02:17:14,880 --> 02:17:18,770 Notice that the cursor will move to\n 2412 02:17:18,771 --> 02:17:20,781 If I keep hitting it,\nit just automatically 2413 02:17:20,781 --> 02:17:22,371 by nature of hitting enter, does it. 
2414 02:17:22,370 --> 02:17:25,310 But it'd be kind of stupid if when\n 2415 02:17:25,310 --> 02:17:28,310 simple as it is, if\nthe next command is now 2416 02:17:28,310 --> 02:17:31,770 weirdly spaced in the middle of\n 2417 02:17:33,111 --> 02:17:35,061 It's really just an aesthetic argument. 2418 02:17:35,060 --> 02:17:40,940 And notice that it's not acceptable or\n 2419 02:17:40,941 --> 02:17:43,441 Let me go ahead and save that,\nthough, and see what happens. 2420 02:17:43,441 --> 02:17:47,001 Let me go ahead now and\nrun Make Hello enter. 2421 02:17:49,161 --> 02:17:53,390 This is like, what, 10 lines of\nerrors for a one line program. 2422 02:17:53,390 --> 02:17:56,390 And this is where, again, you'll start\n 2423 02:17:57,380 --> 02:18:00,530 These kinds of tools, like\nthe compiler tool we're using 2424 02:18:00,531 --> 02:18:03,861 were not designed necessarily\nwith user friendliness in mind. 2425 02:18:03,861 --> 02:18:06,081 That's changed over the\ndecades, but certainly early 2426 02:18:06,081 --> 02:18:09,810 on it's really just meant to be\n 2427 02:18:11,150 --> 02:18:13,941 Missing terminating\nclose quote character 2428 02:18:13,941 --> 02:18:17,061 long story short, when\nyou have a string in C 2429 02:18:17,060 --> 02:18:20,511 your double quotes just have to\n 2430 02:18:20,511 --> 02:18:22,049 Now, there's the slight white lie. 2431 02:18:23,091 --> 02:18:29,631 But the best way around it is to\n 2432 02:18:29,630 --> 02:18:32,390 To escape something means generally\nto put a backslash, and then 2433 02:18:32,390 --> 02:18:34,710 a special symbol like n for new line. 2434 02:18:34,710 --> 02:18:39,510 And this is just the agreed upon way\n 2435 02:18:39,511 --> 02:18:42,650 OK you don't just hit your enter key. 2436 02:18:42,650 --> 02:18:46,941 You instead put backslash n\nand that tells the computer 2437 02:18:46,941 --> 02:18:49,471 to move the cursor to the new line. 2438 02:18:50,977 --> 02:18:52,310 But once you know it, that's it. 
2439 02:18:52,310 --> 02:18:55,470 It's just another word\nin our vocabulary. 2440 02:18:55,470 --> 02:18:58,550 So now let me transition to making\n 2441 02:18:58,550 --> 02:19:00,140 Instead of just saying\nHello, world, let me 2442 02:19:00,140 --> 02:19:02,015 change it like last week\nto say Hello, David 2443 02:19:02,015 --> 02:19:04,320 or whoever is interacting\nwith the program. 2444 02:19:04,320 --> 02:19:07,851 So I'm going to do string\nanswer gets, get string 2445 02:19:07,851 --> 02:19:11,256 quote unquote, what's your name. 2446 02:19:11,255 --> 02:19:13,130 I'm not going to bother\nwith a new line here. 2447 02:19:13,730 --> 02:19:15,181 This is now just a judgment call. 2448 02:19:15,181 --> 02:19:17,889 I deliberately want the human to\n 2449 02:19:20,790 --> 02:19:22,970 Well last week recall we used say. 2450 02:19:22,970 --> 02:19:25,251 And then we use the\nother block called join. 2451 02:19:25,251 --> 02:19:27,681 So the idea here is the same. 2452 02:19:27,681 --> 02:19:30,181 But the syntax this week is\ngoing to be a little different. 2453 02:19:30,181 --> 02:19:33,871 It's going to be printf, which\nprints something on the screen. 2454 02:19:33,870 --> 02:19:38,090 I'm going to go ahead\nand say Hello comma. 2455 02:19:38,091 --> 02:19:43,070 And let me just go with this initially\n 2456 02:19:43,070 --> 02:19:46,700 Let me go ahead and recompile my code. 2457 02:19:46,700 --> 02:19:50,292 Whoops, damn doesn't work still. 2458 02:19:50,292 --> 02:19:51,501 And look at all these errors. 2459 02:19:51,501 --> 02:19:54,441 There's more errors than code I wrote. 2460 02:19:56,431 --> 02:19:58,881 Well, this is actually\nsomething, a mistake you'll see 2461 02:19:58,880 --> 02:20:01,130 somewhat often, at least initially. 2462 02:20:01,130 --> 02:20:03,270 And let's start to glean\nwhat's going on here. 
2463 02:20:03,271 --> 02:20:06,921 So here, if I look at the very first\n 2464 02:20:06,921 --> 02:20:09,921 so even though it jumped\ndown the screen pretty fast 2465 02:20:09,921 --> 02:20:12,831 I wrote Make Hello at\nthe dollar sign, prompt. 2466 02:20:12,831 --> 02:20:14,361 And then here's the first error. 2467 02:20:17,120 --> 02:20:21,020 technically character 5, but generally\n 2468 02:20:21,021 --> 02:20:24,951 there's an error, use of\nundeclared identifier string. 2469 02:20:28,101 --> 02:20:30,361 And this is not an\nobvious solution at first. 2470 02:20:30,361 --> 02:20:34,161 But you'll start to recognize\nthese patterns in error messages. 2471 02:20:34,161 --> 02:20:39,740 It turns out that if I want to use\n 2472 02:20:39,740 --> 02:20:43,879 I have to include another library\nup here, another line of code 2473 02:20:43,879 --> 02:20:46,460 rather, called CS50\ndot H. We'll come back 2474 02:20:46,460 --> 02:20:48,300 to what this means in just a moment. 2475 02:20:48,300 --> 02:20:54,560 But if I now retroactively say,\n 2476 02:20:55,670 --> 02:20:59,720 Before I added that new line,\nwhat is standard I/O doing? 2477 02:20:59,720 --> 02:21:01,670 Well, if you think\nback to Scratch, there 2478 02:21:01,670 --> 02:21:07,761 were a few examples with the camera and\n 2479 02:21:07,761 --> 02:21:10,281 Remember I had to poke around\nin the extensions button. 2480 02:21:10,281 --> 02:21:12,110 And then I had to load it into Scratch. 2481 02:21:12,110 --> 02:21:14,240 It didn't come natively with Scratch. 2482 02:21:15,920 --> 02:21:18,170 Some functions come with the language. 2483 02:21:18,170 --> 02:21:22,161 But for the most part, if you want to\n 2484 02:21:22,161 --> 02:21:26,570 like printf, you have to load\nthat extension, so to speak 2485 02:21:26,570 --> 02:21:29,361 that more traditionally\nis called a library. 
2486 02:21:29,361 --> 02:21:35,121 So there is a standard I/O\nlibrary, STD I/O, standard I/O 2487 02:21:35,120 --> 02:21:37,350 where I/O just means input and output. 2488 02:21:37,351 --> 02:21:39,441 Which means, just like\nin MIT's World, there 2489 02:21:39,441 --> 02:21:43,161 was an extension for doing text\n 2490 02:21:43,161 --> 02:21:45,470 In C, there's an extension, a.k.a. 2491 02:21:45,470 --> 02:21:49,230 a library, for doing\nstandard input and output. 2492 02:21:49,230 --> 02:21:53,450 And so if you want to use any functions\n 2493 02:21:53,450 --> 02:21:58,130 like text from a keyboard, you\nhave to include standard I/O dot 2494 02:21:58,130 --> 02:22:02,360 H. And then can you use printf. 2495 02:22:03,890 --> 02:22:08,690 Get string, it turns out, is a\n 2496 02:22:08,691 --> 02:22:10,971 And as we'll see over\nthe coming weeks, it just 2497 02:22:10,970 --> 02:22:14,780 makes it way easier to\nget input from a user. 2498 02:22:14,781 --> 02:22:18,111 C is very good with printf at\nprinting output on the screen. 2499 02:22:18,111 --> 02:22:21,411 C makes it really annoying and\n 2500 02:22:21,411 --> 02:22:23,191 to just get input from the user. 2501 02:22:23,191 --> 02:22:25,820 So we wrote a function\ncalled get_string 2502 02:22:25,820 --> 02:22:29,240 but the only way you can use\nthat is to load the extension 2503 02:22:29,240 --> 02:22:32,271 a.k.a. load the library called CS50. 2504 02:22:32,271 --> 02:22:35,841 And we'll come back in time, like,\n 2505 02:22:35,841 --> 02:22:39,291 But for now, standard\nI/O is a library that 2506 02:22:39,290 --> 02:22:42,540 gives you access to printf and\ninput- and output-related stuff. 2507 02:22:42,540 --> 02:22:45,110 CS50 is a second library\nthat provides you 2508 02:22:45,111 --> 02:22:48,441 with access to functions\nthat don't come with C 2509 02:22:48,441 --> 02:22:51,501 that include something like get_string. 
2510 02:22:51,501 --> 02:22:55,191 So with that said, we've\nnow kind of teased apart 2511 02:22:55,191 --> 02:22:58,171 at a high level what lines\n2 and now 1 are doing. 2512 02:22:58,171 --> 02:23:00,380 Let me go ahead and rerun make hello. 2513 02:23:01,611 --> 02:23:05,099 So all those crazy error messages\nwere resolved by just one fix 2514 02:23:05,099 --> 02:23:07,640 so key takeaway is not to get\noverwhelmed by the sheer number 2515 02:23:08,581 --> 02:23:13,521 Let me now do ./hello and if I type\n 2516 02:23:19,470 --> 02:23:24,091 Yeah, hello answer, because the\n 2517 02:23:24,091 --> 02:23:26,671 And it turns out that if\nyou just write "hello 2518 02:23:26,671 --> 02:23:29,611 answer" all in the double quotes,\nyou're really just passing 2519 02:23:29,611 --> 02:23:32,551 English as the input\nto the printf function 2520 02:23:32,550 --> 02:23:34,470 you're not actually\npassing in the variable. 2521 02:23:34,470 --> 02:23:37,320 And unfortunately in\nC, it's not quite as 2522 02:23:37,320 --> 02:23:40,861 easy to plug things in to\nother things that you've typed. 2523 02:23:40,861 --> 02:23:43,111 Remember in Scratch, there\nwas not just the Say block 2524 02:23:43,111 --> 02:23:45,661 but the Join block,\nwhich was kind of pretty 2525 02:23:45,661 --> 02:23:47,460 you can combine apples and oranges-- 2526 02:23:48,900 --> 02:23:52,640 Then we changed it to hello and then\n 2527 02:23:52,640 --> 02:23:54,765 In C, the syntax is going\nto be a little different. 2528 02:23:54,765 --> 02:23:59,850 You tell the computer inside of your\n 2529 02:23:59,851 --> 02:24:05,191 a placeholder there, a so-called\n 2530 02:24:05,191 --> 02:24:08,041 put a string here eventually. 2531 02:24:08,040 --> 02:24:12,661 Then outside of your quotes, you\n 2532 02:24:12,661 --> 02:24:18,421 in whatever variable you want the\n 2533 02:24:19,230 --> 02:24:23,671 So %s is a format code which\nserves as a placeholder.
2534 02:24:23,671 --> 02:24:26,431 And now the printf function\nwas designed by humans years 2535 02:24:26,431 --> 02:24:28,650 ago to figure out how to\ndo the apple and banana 2536 02:24:28,650 --> 02:24:30,972 thing of joining two words together. 2537 02:24:30,972 --> 02:24:33,181 It's not nearly as user-friendly\nas it is in Scratch 2538 02:24:33,181 --> 02:24:35,701 but it's a very common paradigm. 2539 02:24:35,700 --> 02:24:38,880 So let me try and rerun\nthis now. make hello. 2540 02:24:42,630 --> 02:24:45,780 If I type Enter now, now it's hello. 2541 02:24:46,351 --> 02:24:48,810 And the printf, here's the F in printf. 2542 02:24:48,810 --> 02:24:53,581 It formats its input for you by using\n 2543 02:24:53,581 --> 02:24:58,201 strings, represented again by %s. 2544 02:24:58,200 --> 02:25:02,650 So a quick question then, if I focus\n 2545 02:25:02,650 --> 02:25:09,480 and even zoom in here, how many\n 2546 02:25:09,480 --> 02:25:13,501 A moment ago, I'll admit that it was\n 2547 02:25:14,790 --> 02:25:19,980 How many inputs might you\ninfer printf is taking now? 2548 02:25:20,521 --> 02:25:25,201 And it's implied by this comma here,\n 2549 02:25:25,200 --> 02:25:29,310 quote, unquote, "hello, %s"\nfrom the second one, answer. 2550 02:25:29,310 --> 02:25:32,730 And then just as a quick safety\ncheck here, why is it not 3? 2551 02:25:32,730 --> 02:25:35,320 Because there's obviously\ntwo commas here. 2552 02:25:35,320 --> 02:25:38,280 Why is it not actually\n3 arguments or inputs? 2553 02:25:43,001 --> 02:25:46,210 The comma to the left is actually\npart of my English grammar 2554 02:25:47,693 --> 02:25:50,110 And, again, here's where\nprogramming can just be confusing 2555 02:25:50,111 --> 02:25:52,901 early on because we're using the\n 2556 02:25:52,900 --> 02:25:55,730 different things, it just\ndepends on the context. 
2557 02:25:55,730 --> 02:25:57,970 And so now is actually\na good time to point out 2558 02:25:57,970 --> 02:26:01,751 all of the somewhat pretty colors that\n 2559 02:26:02,290 --> 02:26:06,040 even though I wasn't going to a format\n 2560 02:26:06,040 --> 02:26:08,990 I certainly wasn't changing\nthings to red or blue or whatnot-- 2561 02:26:08,990 --> 02:26:13,900 that's because a text editor like\n 2562 02:26:13,900 --> 02:26:17,050 This is a feature of so many different\n 2563 02:26:18,640 --> 02:26:23,020 If your text editor understands the\n 2564 02:26:24,320 --> 02:26:28,700 it highlights in different colors the\n 2565 02:26:28,700 --> 02:26:31,870 So, for instance, string and\nanswer here are in black 2566 02:26:31,870 --> 02:26:35,827 but get_string a function is in\nthis sort of nasty brown-yellow 2567 02:26:35,827 --> 02:26:38,411 here right now, but that's just\nhow it displays on the screen. 2568 02:26:38,411 --> 02:26:41,111 The string, though, here in red\nis kind of jumping out at me 2569 02:26:41,111 --> 02:26:42,791 and that's marginally useful. 2570 02:26:44,113 --> 02:26:46,280 That's kind of nice, because\nit's jumping out at me. 2571 02:26:46,281 --> 02:26:49,601 And so it's just using different colors\n 2572 02:26:49,601 --> 02:26:53,021 pop so you can focus on\nhow these ideas interrelate 2573 02:26:53,021 --> 02:26:55,121 and, honestly, when you\nmight make a mistake. 2574 02:26:55,120 --> 02:26:58,030 For instance, let me accidentally\nleave off this quote here. 2575 02:26:58,031 --> 02:27:03,011 And now all of a sudden,\nnotice if I delete the quote 2576 02:27:03,011 --> 02:27:05,711 the colors start to get a little awry. 2577 02:27:05,710 --> 02:27:09,640 But if I go back there and put it\n 2578 02:27:09,640 --> 02:27:11,470 What's another feature\nof this text editor? 
2579 02:27:11,470 --> 02:27:15,251 Notice when my cursor is next\nto this parenthesis, which 2580 02:27:15,251 --> 02:27:18,111 demarcates the end of the\ninputs to the function 2581 02:27:18,111 --> 02:27:21,611 notice that highlighted in green\n 2582 02:27:22,150 --> 02:27:24,237 It's just a visually useful\nthing, especially when 2583 02:27:24,237 --> 02:27:26,320 you start writing more and\nmore code, just to make 2584 02:27:26,320 --> 02:27:28,240 sure your parentheses are lining up. 2585 02:27:28,240 --> 02:27:31,390 And that's true for these curly braces\n 2586 02:27:31,390 --> 02:27:33,011 We'll come back to those in a moment. 2587 02:27:33,011 --> 02:27:36,820 If I put my cursor there, you can\n 2588 02:27:37,700 --> 02:27:40,600 So it's nothing in your code\nfundamentally, it's just the editor 2589 02:27:40,601 --> 02:27:42,911 trying to help you, the human, program. 2590 02:27:42,911 --> 02:27:45,251 And you can even see it,\nthough it's a little subtle-- 2591 02:27:45,251 --> 02:27:48,130 see these four dots here\nand these four dots here? 2592 02:27:49,691 --> 02:27:53,531 I configured VS Code to\nindent by four spaces, which 2593 02:27:54,761 --> 02:27:58,091 Any time I hit the Tab key, this\ntoo can help you make sure-- 2594 02:27:58,091 --> 02:28:00,941 once we have more interesting\nand longer programs-- 2595 02:28:00,941 --> 02:28:04,121 that everything lines\nup nice and neatly. 2596 02:28:05,380 --> 02:28:07,390 All right, any questions\nthen on printf or more? 2597 02:28:08,653 --> 02:28:11,122 AUDIENCE: [? Would ?]\nthe printf [INAUDIBLE]?? 2598 02:28:11,122 --> 02:28:12,831 DAVID J. MALAN: Short\nanswer, yes. printf 2599 02:28:12,831 --> 02:28:16,130 can handle more than one\ntype of variable or value. 2600 02:28:17,240 --> 02:28:20,671 We're going to see %i is another\nfor plugging in an integer. 
2601 02:28:20,671 --> 02:28:24,261 You can have multiple i's, multiple\n 2602 02:28:24,261 --> 02:28:26,150 We'll come back to that\nin just a little bit. 2603 02:28:26,150 --> 02:28:29,841 printf can take many more\narguments than just these two. 2604 02:28:29,841 --> 02:28:32,150 This is just meant to be representative. 2605 02:28:34,191 --> 02:28:36,201 Can you declare variables\nwithin the printf? 2606 02:28:37,611 --> 02:28:39,651 The only variable I'm\nusing right now is answer 2607 02:28:39,650 --> 02:28:43,251 and it's got to be done outside\n 2608 02:28:43,251 --> 02:28:46,310 Good question, we'll see\nmore of that before long. 2609 02:28:49,744 --> 02:28:51,911 DAVID J. MALAN: How do we\ndownload the CS50 library? 2610 02:28:51,911 --> 02:28:55,091 So we will show you in problems\nset 1 exactly how to do that. 2611 02:28:55,091 --> 02:28:58,421 It's automatically done for you in\n 2612 02:28:58,421 --> 02:29:01,810 If, ultimately, you program on your own\n 2613 02:29:01,810 --> 02:29:03,700 on, it's also installable online. 2614 02:29:03,700 --> 02:29:07,073 But if you want to ask that\nvia online or afterward 2615 02:29:07,074 --> 02:29:08,740 we can point you in the right direction. 2616 02:29:13,341 --> 02:29:16,970 DAVID J. MALAN: String is the type\n 2617 02:29:16,970 --> 02:29:19,100 the data type of the variable. 2618 02:29:19,101 --> 02:29:22,281 int is another keyword I alluded\n 2619 02:29:22,281 --> 02:29:26,191 int, for integer, is going to be\n 2620 02:29:26,191 --> 02:29:27,441 AUDIENCE: OK. [? Thank you. ?] 2621 02:29:29,691 --> 02:29:31,108 DAVID J. MALAN: Oh, good question. 2622 02:29:31,108 --> 02:29:34,970 Could I go ahead and just\nplug in this function 2623 02:29:34,970 --> 02:29:39,261 kind of like we did in Scratch,\n 2624 02:29:39,261 --> 02:29:42,470 and just do this, which\nrecall, is reminiscent of what 2625 02:29:42,470 --> 02:29:45,921 I did in Scratch by plopping\nblock on top of block on block? 
2626 02:29:48,271 --> 02:29:50,230 Can I put string in front of get_string? 2627 02:29:50,730 --> 02:29:53,700 You only put the word string\nin front of a variable 2628 02:29:55,200 --> 02:29:57,870 And even though I'm apparently\nanswering the wrong question 2629 02:29:57,870 --> 02:30:01,710 let me go ahead and zoom out,\nsave this, do make hello again. 2630 02:30:03,450 --> 02:30:06,180 If I run ./hello, type in David, voila. 2631 02:30:07,450 --> 02:30:10,390 And so, actually, let's go down\n 2632 02:30:10,390 --> 02:30:12,210 Clearly, it's still correct-- 2633 02:30:12,210 --> 02:30:14,610 at least, based on my limited testing. 2634 02:30:14,611 --> 02:30:18,001 Is this better designed\nor worse designed? 2635 02:30:18,001 --> 02:30:20,130 Let's open that question\nlike we did last week. 2636 02:30:21,480 --> 02:30:23,820 Yeah, I kind of agree with that. 2637 02:30:23,820 --> 02:30:26,011 Reasonable people could\ndisagree, but I do 2638 02:30:26,011 --> 02:30:29,970 agree that this seems harder to\n 2639 02:30:29,970 --> 02:30:32,438 but wait a minute. get_string\nis going to get used first 2640 02:30:32,438 --> 02:30:34,271 and then it's going to\ngive me back a value. 2641 02:30:34,271 --> 02:30:37,501 So, yeah, it just feels like it\nwas nicer to read top to bottom 2642 02:30:40,615 --> 02:30:44,611 And so this is useful if I only want\n 2643 02:30:44,611 --> 02:30:47,774 If I want to use it later in a\nlonger program, I'm out of luck 2644 02:30:47,773 --> 02:30:49,440 and so I haven't saved it in a variable. 2645 02:30:49,441 --> 02:30:53,041 So I think, long story short, we\ncould debate this all day long. 2646 02:30:53,040 --> 02:30:56,150 But in this case, eh, if you can\nmake a reasonable argument one 2647 02:30:56,150 --> 02:30:59,240 way or the other, that's a\npretty solid ground to stand on. 
2648 02:30:59,240 --> 02:31:01,101 But, invariably,\nreasonable people are going 2649 02:31:01,101 --> 02:31:05,641 to disagree, whether first-time\n 2650 02:31:05,640 --> 02:31:09,740 So let's frame this one last example\n 2651 02:31:09,740 --> 02:31:11,380 of taking inputs and outputs. 2652 02:31:11,380 --> 02:31:13,130 The functions we've\nbeen talking about all 2653 02:31:13,130 --> 02:31:17,720 take inputs, otherwise now known as\n 2654 02:31:18,531 --> 02:31:21,740 That's just the fancy word\nfor an input to a function. 2655 02:31:21,740 --> 02:31:25,046 And some functions have either\nside effects, like we saw-- 2656 02:31:25,046 --> 02:31:27,171 printing something, saying\nsomething on the screen 2657 02:31:27,171 --> 02:31:28,911 sort of visually or audibly-- 2658 02:31:28,911 --> 02:31:33,921 or they return a value, which is a\n 2659 02:31:35,251 --> 02:31:39,261 If we look then at what we did last\n 2660 02:31:39,261 --> 02:31:41,810 the input was what's your\nname, the function was ask 2661 02:31:41,810 --> 02:31:44,780 and the return value was answer. 2662 02:31:44,781 --> 02:31:49,521 And now let's take a look at this block,\n 2663 02:31:49,521 --> 02:31:51,266 version of what we just did with the %s. 2664 02:31:51,266 --> 02:31:54,921 Last week we said say, then\njoin, then hello and answer. 2665 02:31:54,921 --> 02:31:58,581 But the interesting takeaway there\n 2666 02:31:58,581 --> 02:32:03,021 It was the fact that in Scratch,\ntoo, the output of one function 2667 02:32:03,021 --> 02:32:08,150 like the green join, could become\nthe input to another function 2668 02:32:09,501 --> 02:32:12,470 The syntax in C is\nadmittedly pretty different 2669 02:32:12,470 --> 02:32:14,480 but the idea is essentially the same. 2670 02:32:14,480 --> 02:32:18,380 Here, though, we have\nhello, a placeholder 2671 02:32:18,380 --> 02:32:21,890 but we have to, in this\nworld of C, tell printf 2672 02:32:21,890 --> 02:32:25,110 what we want to plug in\nfor that placeholder.
2673 02:32:25,980 --> 02:32:27,147 But that's the way to do it. 2674 02:32:27,147 --> 02:32:29,690 When we get to Python and other\nlanguages later in the term 2675 02:32:29,691 --> 02:32:31,521 there's actually easier ways to do this. 2676 02:32:31,521 --> 02:32:34,311 But this is a very common\nparadigm, particularly when 2677 02:32:34,310 --> 02:32:37,490 you want to format\nyour data in some way. 2678 02:32:37,490 --> 02:32:40,050 All right, let's then take a\nstep back to where we began 2679 02:32:40,050 --> 02:32:43,700 which was with that whole\nprogram, which had the include 2680 02:32:43,700 --> 02:32:47,630 and it had int main(void) and\nall of this other cryptic syntax. 2681 02:32:47,630 --> 02:32:51,350 This Scratch piece last week\nwas kind of like the go-to 2682 02:32:51,351 --> 02:32:53,869 whenever you want to have a\nmain part of your program. 2683 02:32:53,869 --> 02:32:55,911 It's not the only way to\nstart a Scratch program. 2684 02:32:55,911 --> 02:32:59,181 You could listen for clicks or other\n 2685 02:32:59,181 --> 02:33:03,831 But this was probably the most popular\n 2686 02:33:03,831 --> 02:33:07,478 In C, the closest analog is\nto literally write this out. 
2687 02:33:07,478 --> 02:33:10,520 So just like last week, if you were\n 2688 02:33:10,521 --> 02:33:13,101 when green flag clicked,\nas a C programmer 2689 02:33:13,101 --> 02:33:15,771 the first thing you would do is\nafter creating an empty file 2690 02:33:15,771 --> 02:33:18,621 like I did with hello.c,\nyou'd probably type int 2691 02:33:18,620 --> 02:33:22,040 main(void) open curly\nbrace, closed curly brace 2692 02:33:22,040 --> 02:33:26,100 and then you can put all of your\n 2693 02:33:26,101 --> 02:33:29,211 So just like Scratch had\nthis sort of magnetic nature 2694 02:33:29,210 --> 02:33:33,530 to it where the puzzle pieces would snap\n 2695 02:33:33,531 --> 02:33:38,240 tends to use these curly braces, one\n 2696 02:33:38,240 --> 02:33:41,480 And anything inside of\nthose braces, so to speak 2697 02:33:41,480 --> 02:33:44,570 is part of this puzzle piece, a.k.a. 2698 02:33:47,450 --> 02:33:50,840 We went down this rabbit hole moment ago\n 2699 02:33:50,841 --> 02:33:52,674 even though I didn't\ncall them by this name. 2700 02:33:52,674 --> 02:33:56,856 But, indeed, when we have a whole\n 2701 02:33:56,855 --> 02:33:59,480 Just have the one green flag\nclicked and then say hello, world. 2702 02:34:00,681 --> 02:34:03,351 After all, it's meant to be very\nuser-friendly and graphical. 2703 02:34:03,351 --> 02:34:09,531 In C, though, you technically can't just\n 2704 02:34:10,700 --> 02:34:16,340 Because, again, you need to tell\n 2705 02:34:16,341 --> 02:34:22,041 code that someone else wrote-- so that\n 2706 02:34:22,040 --> 02:34:24,980 You have to load the\nCS50 library whenever 2707 02:34:24,980 --> 02:34:28,280 you want to use get_string or\nother functions, like get_int 2708 02:34:29,630 --> 02:34:32,690 Otherwise, the compiler won't\nknow what get_string is. 2709 02:34:32,691 --> 02:34:35,011 You just have to do it this way. 
2710 02:34:35,011 --> 02:34:37,341 The specific file name\nI'm mentioning here 2711 02:34:37,341 --> 02:34:44,310 stdio.h, cs50.h, is what C programmers\n 2712 02:34:44,310 --> 02:34:46,831 We'll see eventually what's\ninside of those files. 2713 02:34:46,831 --> 02:34:51,060 But long story short, it's like a menu\n 2714 02:34:51,060 --> 02:34:54,560 So in cs50.h, there's a menu\nmentioning get_string, get_int 2715 02:34:56,150 --> 02:35:01,790 And in stdio.h, there's a menu of\n 2716 02:35:01,790 --> 02:35:04,640 And that menu is what\nprepares the compiler 2717 02:35:04,640 --> 02:35:08,945 to know how to implement\nthose same functions. 2718 02:35:08,945 --> 02:35:10,310 All right, let me pause here. 2719 02:35:16,960 --> 02:35:20,560 A library provides all of the\nfunctionality we're talking about. 2720 02:35:20,560 --> 02:35:25,720 A header file is the very specific\n 2721 02:35:25,720 --> 02:35:27,770 And we'll discuss this more next week. 2722 02:35:27,771 --> 02:35:30,161 For now, they're essentially\nthe same, but we'll discuss 2723 02:35:30,161 --> 02:35:32,800 nuances between the two next week. 2724 02:35:32,800 --> 02:35:36,460 Yeah, the library would be standard\nI/O. The library would be CS50. 2725 02:35:36,460 --> 02:35:41,720 The corresponding header\nfile is stdio.h, cs50.h. 2726 02:35:50,480 --> 02:35:54,230 incredibly common in the world of\n 2727 02:35:54,230 --> 02:35:59,300 but in C, there's technically no\n 2728 02:35:59,300 --> 02:36:02,216 We have sort of conjured it up\nto simplify the first few weeks. 2729 02:36:02,216 --> 02:36:05,091 That's a training wheel that we'll\n 2730 02:36:05,091 --> 02:36:09,411 take away, and we'll see why we've\n 2731 02:36:09,411 --> 02:36:14,161 Because C otherwise makes things\n 2732 02:36:14,161 --> 02:36:16,191 which then gets beside\nthe point for us. 2733 02:36:20,570 --> 02:36:23,480 Early on, you will have to use whatever\n 2734 02:36:23,480 --> 02:36:24,950 That will include CS50's functions.
2735 02:36:24,950 --> 02:36:28,190 Long story short, you referred, I\n 2736 02:36:28,191 --> 02:36:30,981 called scanf, we won't\ntalk about for a few weeks. 2737 02:36:30,980 --> 02:36:36,710 Long story short, in C, it's pretty easy\n 2738 02:36:36,710 --> 02:36:40,310 The catch is that it's really\neasy to do it dangerously. 2739 02:36:40,310 --> 02:36:45,110 And C, because it's an older,\nlower-level language, so to speak 2740 02:36:45,111 --> 02:36:49,431 that gives you pretty much ultimate\n 2741 02:36:49,431 --> 02:36:51,621 It's very easy to make mistakes. 2742 02:36:51,620 --> 02:36:55,100 And, indeed, that's too\nwhy we use the library 2743 02:36:55,101 --> 02:36:58,541 so your code won't crash unintendedly. 2744 02:36:58,540 --> 02:37:01,620 All right, so with this in\nmind, we have this now mapping 2745 02:37:01,620 --> 02:37:03,370 between the Scratch\nversion and the other. 2746 02:37:03,370 --> 02:37:06,537 Let me just give you a quick tour of\n 2747 02:37:06,538 --> 02:37:10,240 types that students will start seeing as\n 2748 02:37:10,240 --> 02:37:13,511 In the world of Linux, here\nis a non-exhaustive list 2749 02:37:13,511 --> 02:37:16,810 of commands with which you'll get\n 2750 02:37:16,810 --> 02:37:18,040 by playing with problem sets. 2751 02:37:18,040 --> 02:37:22,783 We've only seen two of these so\nfar, ls for list, rm for remove. 2752 02:37:22,783 --> 02:37:24,700 But I mention them now\njust so that it doesn't 2753 02:37:24,700 --> 02:37:30,340 feel too foreign when you see them\n 2754 02:37:30,341 --> 02:37:32,411 cp is going to stand for copy. 2755 02:37:32,411 --> 02:37:35,710 mkdir is going to stand\nfor make directory. mv is 2756 02:37:35,710 --> 02:37:38,470 going to stand for move or rename. 2757 02:37:38,470 --> 02:37:44,470 rmdir is going to be remove directory,\n 2758 02:37:44,470 --> 02:37:46,960 and let me show you this\nlast one here first 2759 02:37:46,960 --> 02:37:49,510 only because it's something\nyou'll use so commonly.
2760 02:37:49,511 --> 02:37:53,681 If I go back to my code here on\n 2761 02:37:53,681 --> 02:37:58,091 and re-open the little GUI on the\n 2762 02:37:58,091 --> 02:38:01,451 revealing that I've got two\nfiles, hello and hello.c 2763 02:38:01,450 --> 02:38:02,950 so nothing has changed since there. 2764 02:38:02,950 --> 02:38:05,800 Suppose now that it's\na few weeks into class 2765 02:38:05,800 --> 02:38:07,661 and I want to start\norganizing the code I'm 2766 02:38:07,661 --> 02:38:10,511 writing so that I have a folder\nfor this week or next week 2767 02:38:10,511 --> 02:38:13,421 or maybe a folder for\nproblem set 1, problem set 2. 2768 02:38:15,130 --> 02:38:18,190 In the GUI, I can go up\nhere and do what most of you 2769 02:38:18,191 --> 02:38:19,961 would do instinctively on a Mac or PC. 2770 02:38:19,960 --> 02:38:22,450 You look for like a\nfolder icon, you click it 2771 02:38:22,450 --> 02:38:25,960 and then you name a\nfolder like PSet1, Enter. 2772 02:38:25,960 --> 02:38:28,690 Voila, you've got a folder called PSet1. 2773 02:38:28,691 --> 02:38:34,701 I can confirm as much with my command\n 2774 02:38:34,700 --> 02:38:36,890 How can I list what's in my folder? 2775 02:38:39,504 --> 02:38:41,421 and it's green with an\nasterisk because that's 2776 02:38:41,421 --> 02:38:43,581 my executable, my runnable program-- 2777 02:38:43,581 --> 02:38:46,701 hello.c, which is my source\ncode, and now PSet1 with a slash 2778 02:38:46,700 --> 02:38:50,030 at the end, which just implies\nthat it's indeed a folder. 2779 02:38:50,031 --> 02:38:52,321 All right, I didn't really\nwant to do it that way. 2780 02:38:52,320 --> 02:38:53,870 I'd like to do it more advanced. 2781 02:38:53,870 --> 02:38:57,530 So let me go ahead and right-click\non PSet1, delete permanently. 2782 02:38:57,531 --> 02:38:59,711 I get a scary irreversible\nerror message. 2783 02:38:59,710 --> 02:39:01,460 But there's nothing\nin it, so that's fine. 
2784 02:39:01,460 --> 02:39:03,560 Now I've deleted it using the GUI. 2785 02:39:03,560 --> 02:39:08,851 But now let me go ahead and start doing\n 2786 02:39:08,851 --> 02:39:11,060 And if you're wondering how\nthings keep disappearing 2787 02:39:11,060 --> 02:39:15,290 if you hit Control-L in your terminal\n 2788 02:39:15,290 --> 02:39:18,470 it will delete everything you previously\n 2789 02:39:18,470 --> 02:39:20,595 In practice, you don't need\nto be doing this often. 2790 02:39:20,595 --> 02:39:23,480 I'm doing it just to keep our\nfocus on my latest commands. 2791 02:39:23,480 --> 02:39:26,962 If I do-- what was the command\nto make a new directory? 2792 02:39:28,320 --> 02:39:30,540 DAVID J. MALAN: Yeah, so\nmkdir, make directory. 2793 02:39:32,251 --> 02:39:34,621 And notice at left, there's my PSet1. 2794 02:39:34,620 --> 02:39:37,210 If I want to get a little\noverzealous, plan for next week 2795 02:39:39,120 --> 02:39:44,722 Suppose now I want to open those\n 2796 02:39:44,722 --> 02:39:46,681 I could double-click on\nit like this, and you'd 2797 02:39:46,681 --> 02:39:48,408 see this little arrow is moving. 2798 02:39:48,407 --> 02:39:51,490 It's not doing anything because there's\n 2799 02:39:51,490 --> 02:39:55,050 But suppose again I want to get more\n 2800 02:39:55,050 --> 02:39:59,130 Notice if I type ls now, I\nsee all four same things. 2801 02:39:59,130 --> 02:40:05,820 Let me change directories\nwith cd space PSet1 Enter. 2802 02:40:05,820 --> 02:40:08,460 And now notice two things\nwill have happened. 2803 02:40:08,460 --> 02:40:14,070 One, my prompt has changed\nslightly to remind me where I am 2804 02:40:14,070 --> 02:40:17,730 just to keep me sane so that I don't\n 2805 02:40:17,730 --> 02:40:21,610 So here is just a visual reminder\n 2806 02:40:21,611 --> 02:40:26,911 If I type ls now, what should\nI see after hitting Enter? 2807 02:40:26,911 --> 02:40:29,591 Nothing, because I've only\ncreated empty folders so far. 
2808 02:40:31,050 --> 02:40:35,490 If I wanted to create a folder called\n 2809 02:40:35,490 --> 02:40:37,890 called Mario this week, I can do that. 2810 02:40:37,890 --> 02:40:40,530 Now if I type ls, there is Mario. 2811 02:40:40,531 --> 02:40:42,871 Now if I do cd Mario,\nnotice my prompt's going 2812 02:40:42,870 --> 02:40:44,520 to change to be a little more precise. 2813 02:40:47,191 --> 02:40:49,051 And notice what's happening at top left. 2814 02:40:49,050 --> 02:40:51,150 Nothing now, because these\nfolders are collapsed. 2815 02:40:51,150 --> 02:40:54,390 But if I click the little\ntriangle, there I see Mario. 2816 02:40:54,390 --> 02:40:56,911 Nothing's going on in there\nbecause there's no files yet. 2817 02:40:56,911 --> 02:41:00,810 But suppose now I want to\ncreate a file called mario.c. 2818 02:41:00,810 --> 02:41:04,770 I could go up here, I could click the\n 2819 02:41:04,771 --> 02:41:07,591 Or I can just type code mario.c. 2820 02:41:08,220 --> 02:41:10,082 That creates a new tab for me. 2821 02:41:10,083 --> 02:41:13,291 I'm not going to write any code in here\n 2822 02:41:13,290 --> 02:41:16,950 And now at top left, you'll\nsee that mario.c appears. 2823 02:41:16,950 --> 02:41:19,492 So at some point, you can\neventually just close the Explorer. 2824 02:41:19,493 --> 02:41:22,158 Because, again, it's not providing\nyou with any new information. 2825 02:41:22,158 --> 02:41:23,970 It's maybe more\nuser-friendly, but there's 2826 02:41:23,970 --> 02:41:27,960 nothing you can't do at the command\n 2827 02:41:27,960 --> 02:41:29,880 All right, but now I'm kind of stuck. 2828 02:41:29,880 --> 02:41:32,220 How do I get out of this folder? 2829 02:41:32,220 --> 02:41:34,020 In my Mac or PC world,\nI'd probably click 2830 02:41:34,021 --> 02:41:37,320 the Back button or something like that\n 2831 02:41:37,320 --> 02:41:42,181 In the terminal window,\nI can do cd dot dot. 
2832 02:41:42,181 --> 02:41:46,650 Dot dot is a nickname, if you\nwill, for the parent directory. 2833 02:41:46,650 --> 02:41:48,101 That is, the previous directory. 2834 02:41:48,101 --> 02:41:52,601 So if I hit Enter now, notice I'm\n 2835 02:41:52,601 --> 02:41:56,131 a.k.a. directory, and\nnow I'm back in PSet1. 2836 02:41:56,130 --> 02:42:00,240 Or, if I want to be fancy, let me\n 2837 02:42:00,240 --> 02:42:03,300 If I type ls, there's\nmario.c, just to orient us. 2838 02:42:03,300 --> 02:42:06,458 If I want to do multiple things\nat a time, I could do cd ../.. 2839 02:42:09,271 --> 02:42:13,021 which goes to my parent to my\ngrandparent all in one breath. 2840 02:42:13,021 --> 02:42:16,541 And voila, now I'm back in my\ndefault folder, if you will. 2841 02:42:16,540 --> 02:42:20,911 And one last little trick of the\n 2842 02:42:20,911 --> 02:42:24,181 was a moment ago, and you're\njust tired of all the navigation 2843 02:42:24,181 --> 02:42:26,881 if you just type cd and\nhit Enter, it'll whisk you 2844 02:42:26,880 --> 02:42:29,070 away back to your default\nfolder, and you don't have 2845 02:42:29,070 --> 02:42:31,320 to worry about getting there manually. 2846 02:42:31,320 --> 02:42:38,521 Recall a bit ago, though, that I\n 2847 02:42:38,521 --> 02:42:42,751 If dot dot refers to my parent,\nperhaps infer here syntactically 2848 02:42:42,751 --> 02:42:46,720 what does a single dot mean instead? 2849 02:42:46,720 --> 02:42:49,480 It means this directory,\nyour current directory. 2850 02:42:50,900 --> 02:42:52,993 It just makes super\nexplicit to the computer 2851 02:42:52,993 --> 02:42:55,451 that I want the program called\nhello that's installed here 2852 02:42:55,450 --> 02:42:59,500 not in some random other folder\non my hard drive, so to speak. 2853 02:42:59,501 --> 02:43:02,425 I want the one that's\nright here instead.
2854 02:43:02,425 --> 02:43:04,300 All right, so besides\nthese commands, there's 2855 02:43:04,300 --> 02:43:06,452 going to be others that\nwe encounter over time. 2856 02:43:06,452 --> 02:43:07,661 Those are kind of the basics. 2857 02:43:07,661 --> 02:43:11,710 That allows you to wean yourself off\n 2858 02:43:11,710 --> 02:43:14,470 and start using more comfortably,\nwith practice and time 2859 02:43:14,470 --> 02:43:16,360 a command line interface instead. 2860 02:43:16,361 --> 02:43:19,631 Well, what about those other\ntypes, now back in the world of C? 2861 02:43:19,630 --> 02:43:23,950 Those commands were not C. Those are\n 2862 02:43:23,950 --> 02:43:28,300 interface, like in Linux, which,\n 2863 02:43:28,300 --> 02:43:30,310 It's an alternative\nto Mac OS and Windows. 2864 02:43:30,310 --> 02:43:34,810 Back in the world of C now, we've\nseen strings, which are words. 2865 02:43:34,810 --> 02:43:38,351 I mentioned int or integer,\nbut there's others as well. 2866 02:43:38,351 --> 02:43:42,581 In the world of C, we've\nseen string, we will see int. 2867 02:43:42,581 --> 02:43:46,031 If you want a bigger integer, there's\n 2868 02:43:46,031 --> 02:43:49,451 If you want a single character,\nthere's something called a char. 2869 02:43:49,450 --> 02:43:53,680 If you want a Boolean value,\ntrue or false, there is a bool. 2870 02:43:53,681 --> 02:43:56,111 And if you want a floating-point value-- 2871 02:43:56,111 --> 02:43:59,781 a fancy way of saying a real number,\n 2872 02:43:59,781 --> 02:44:03,201 that is what C and other\nlanguages call a float. 2873 02:44:03,200 --> 02:44:06,610 And if you want even more numbers\nafter the decimal point that 2874 02:44:06,611 --> 02:44:10,211 is more precision, you can\nuse something called a double. 
2875 02:44:10,210 --> 02:44:14,020 That is to say, here is, again,\nan example in programming 2876 02:44:14,021 --> 02:44:17,681 where it's up to you now to provide\n 2877 02:44:17,681 --> 02:44:21,640 that it will rely on to know what\n 2878 02:44:23,331 --> 02:44:26,621 Is it a sound, an image,\na color, or the like? 2879 02:44:26,620 --> 02:44:30,970 These are the types of data types\n 2880 02:44:30,970 --> 02:44:35,800 What are the functions that come in\n 2881 02:44:35,800 --> 02:44:39,251 We talked about standard I/O, and\n 2882 02:44:39,251 --> 02:44:42,911 In the CS50 library, you can\nsee that it follows a pattern. 2883 02:44:42,911 --> 02:44:45,581 The CS50 library exists largely\nfor the first few weeks 2884 02:44:45,581 --> 02:44:50,997 of the class to make our lives easier\n 2885 02:44:50,997 --> 02:44:53,831 So if you want to get a string,\n 2886 02:44:54,700 --> 02:44:57,742 If you want to get an integer from\n 2887 02:44:57,743 --> 02:45:01,091 When you want to get any of those\n 2888 02:45:03,861 --> 02:45:06,221 And they're indeed all\nlowercase by convention. 2889 02:45:07,331 --> 02:45:10,630 If we have the ability now to\nstore different types of data 2890 02:45:10,630 --> 02:45:13,751 and we have functions with which\nto get different types of data 2891 02:45:13,751 --> 02:45:16,841 how might you go about printing\ndifferent types of data? 2892 02:45:16,841 --> 02:45:23,800 Well, we've seen %s for string,\n%i for integer, %c for char 2893 02:45:23,800 --> 02:45:30,130 %f for a float or a double, those\n 2894 02:45:30,130 --> 02:45:33,760 and then %li for a long integer. 2895 02:45:33,761 --> 02:45:36,281 So here's the first\nexample of inconsistencies. 2896 02:45:36,281 --> 02:45:38,981 In an ideal world, that would\njust be %l and we'd move on. 2897 02:45:38,980 --> 02:45:42,940 It's %li instead in this case. 2898 02:45:42,941 --> 02:45:45,701 That's printf and some\nof its format codes.
2899 02:45:49,511 --> 02:45:51,858 no pun intended-- there is\na whole bunch of operators. 2900 02:45:51,858 --> 02:45:54,191 And, indeed, computers, one\nof the first things they did 2901 02:45:54,191 --> 02:45:57,941 was a lot of math and calculations, so\n 2902 02:45:57,941 --> 02:46:01,161 Computers, and in turn, C, really\ngood at addition, subtraction 2903 02:46:01,161 --> 02:46:04,191 multiplication, division,\nand even the percent sign 2904 02:46:04,191 --> 02:46:05,740 which is the remainder operator. 2905 02:46:05,740 --> 02:46:08,650 There's a special symbol\nin C and other languages 2906 02:46:08,650 --> 02:46:13,161 just for getting the remainder, when\n 2907 02:46:13,161 --> 02:46:18,911 There are other features in the world\n 2908 02:46:18,911 --> 02:46:22,931 And there's also what is of\n 2909 02:46:22,931 --> 02:46:27,371 makes it easier over time\nto write fewer characters 2910 02:46:27,370 --> 02:46:29,090 but express your thoughts the same. 2911 02:46:29,091 --> 02:46:33,470 So just as a single example\nof this, as a single example 2912 02:46:33,470 --> 02:46:37,331 consider this use of\na variable last week. 2913 02:46:37,331 --> 02:46:41,740 Here in Scratch is how you might\n 2914 02:46:41,740 --> 02:46:43,960 In C, it's going to be similar. 2915 02:46:43,960 --> 02:46:46,240 If you want the variable\nto be called counter 2916 02:46:46,240 --> 02:46:49,540 you literally write the word counter,\n 2917 02:46:49,540 --> 02:46:53,350 You then use the assignment\noperator, a.k.a. the equals sign 2918 02:46:53,351 --> 02:46:56,931 and you assign it whatever its initial\n 2919 02:46:56,931 --> 02:47:01,001 So, again, the 0 is going to get copied\n 2920 02:47:01,001 --> 02:47:02,990 because of that single equal sign. 2921 02:47:02,990 --> 02:47:05,831 But this isn't sufficient\nin C. What else 2922 02:47:05,831 --> 02:47:08,890 is missing on the right-hand\nside, instinctively now? 
2923 02:47:08,890 --> 02:47:11,485 Even if you've never\nprogrammed in this before. 2924 02:47:13,150 --> 02:47:14,775 DAVID J. MALAN: A semicolon at the end. 2925 02:47:14,775 --> 02:47:16,941 And one other thing, I\nthink, is probably missing. 2926 02:47:18,414 --> 02:47:19,581 DAVID J. MALAN: A data type. 2927 02:47:19,581 --> 02:47:22,671 So if we can keep going\nback and forth here 2928 02:47:22,671 --> 02:47:26,320 what data type seems appropriate\nintuitively for counter? 2929 02:47:27,431 --> 02:47:29,861 So, indeed, we need to\ntell the computer when 2930 02:47:29,861 --> 02:47:32,531 creating a variable what\ntype of data we want 2931 02:47:32,531 --> 02:47:35,721 and we need to finish our\nthought with the semicolon. 2932 02:47:35,720 --> 02:47:38,140 So there might be a counterpart there. 2933 02:47:38,140 --> 02:47:43,270 What about in Scratch if we wanted\n 2934 02:47:43,271 --> 02:47:45,671 We had this very user-friendly\npuzzle piece last time 2935 02:47:45,671 --> 02:47:49,740 that was change counter\nby 1, or add 1 to counter. 2936 02:47:49,740 --> 02:47:54,120 In C, here's where things get\na little more interesting. 2937 02:47:54,120 --> 02:47:58,110 And pretty commonly done, you might\n 2938 02:47:59,310 --> 02:48:02,220 And this is where, again, it's\n 2939 02:48:03,361 --> 02:48:05,341 Otherwise, this makes no sense. 2940 02:48:05,341 --> 02:48:08,251 counter cannot equal\ncounter plus 1, right? 2941 02:48:08,251 --> 02:48:11,171 That just doesn't work if we're\ntalking about integers here. 2942 02:48:11,171 --> 02:48:13,411 That's because the equal\nsign is assignment. 2943 02:48:13,411 --> 02:48:15,751 So it can certainly be the\ncase that you calculate 2944 02:48:15,751 --> 02:48:19,921 counter plus 1, whatever that is,\n 2945 02:48:19,921 --> 02:48:22,871 from right to left to be that new value. 
2946 02:48:22,870 --> 02:48:25,200 This, as we'll see,\nis a very common thing 2947 02:48:25,200 --> 02:48:29,100 to do in programming just to kind of\n 2948 02:48:29,101 --> 02:48:30,931 You can write this more succinctly. 2949 02:48:30,931 --> 02:48:34,711 This code here is what we'll\ncall syntactic sugar, sort 2950 02:48:34,710 --> 02:48:39,540 of a fancy way of saying the same thing\n 2951 02:48:40,320 --> 02:48:44,220 This also adds 1, or whatever\nnumber you type over here 2952 02:48:44,220 --> 02:48:46,081 to the variable on the left. 2953 02:48:46,081 --> 02:48:49,771 And there's one other form of syntactic\n 2954 02:48:49,771 --> 02:48:51,601 and it's even more terse than this. 2955 02:48:51,601 --> 02:48:56,851 That too will increment counter by 1\n 2956 02:48:56,851 --> 02:48:59,701 Or if you change it to minus\nminus, subtracting 1 from it. 2957 02:48:59,700 --> 02:49:02,310 You can't do that with\n2 and 3 and 4, but you 2958 02:49:02,310 --> 02:49:07,860 can do it by default with just plus plus\n 2959 02:49:12,611 --> 02:49:15,641 DAVID J. MALAN: Ah, so when you are\n 2960 02:49:15,640 --> 02:49:20,200 has been created, as we did with\nthe code that looked like this 2961 02:49:20,200 --> 02:49:23,590 you no longer need to remind the\ncomputer what the data type is. 2962 02:49:23,591 --> 02:49:27,220 Thankfully, the computer is\nat least as smart as that. 2963 02:49:27,220 --> 02:49:31,990 It will remember the type of\nthe data that you intended. 2964 02:49:31,990 --> 02:49:34,941 Other questions or comments on this? 2965 02:49:34,941 --> 02:49:36,191 All right, that's quite a lot. 2966 02:49:36,191 --> 02:49:38,570 Why don't we go ahead and\nhere take a 10-minute break? 2967 02:49:38,570 --> 02:49:41,771 And we'll be back, we'll\nstart writing some code. 2968 02:49:44,980 --> 02:49:48,581 We've just looked at some\nof the basics of compiling 2969 02:49:48,581 --> 02:49:50,335 even if it doesn't\nquite feel that basic. 
2970 02:49:50,335 --> 02:49:52,210 But now, let's actually\nstart focusing really 2971 02:49:52,210 --> 02:49:55,510 on writing more and more code,\nmore and more interesting 2972 02:49:55,511 --> 02:49:58,701 code, kind of like we dove\ninto Scratch last week. 2973 02:49:58,700 --> 02:50:00,610 So here I have VS Code open. 2974 02:50:01,570 --> 02:50:04,361 I'm going to focus more on my\n 2975 02:50:04,361 --> 02:50:07,444 Many different ways I can create new\n 2976 02:50:08,511 --> 02:50:11,501 So, again, within this\nenvironment of VS Code 2977 02:50:11,501 --> 02:50:15,581 I can literally write the code\n 2978 02:50:15,581 --> 02:50:18,371 and it just creates a new\nfile for me automatically. 2979 02:50:18,370 --> 02:50:20,411 Or I could do that in the GUI. 2980 02:50:20,411 --> 02:50:23,382 I'm going to go ahead and create\nthis file called calculator.c 2981 02:50:23,382 --> 02:50:25,841 and I'm going to go ahead and\ninclude some familiar things. 2982 02:50:25,841 --> 02:50:30,851 So I'm just going to go ahead and\n 2983 02:50:30,851 --> 02:50:33,911 I'm going to go ahead from\nmemory and do the int main void-- 2984 02:50:33,911 --> 02:50:38,248 more on that next week, why it's\n 2985 02:50:38,248 --> 02:50:40,540 And now let me just implement\na very simple calculator. 2986 02:50:40,540 --> 02:50:44,360 We saw some mathematical\noperators, like plus and the like. 2987 02:50:45,861 --> 02:50:48,671 So let me go ahead and\nfirst give myself a variable 2988 02:50:48,671 --> 02:50:52,480 called x, sort of like grade\nschool math or algebra. 2989 02:50:52,480 --> 02:50:55,570 Let me go ahead then and\nuse get_int, which is new 2990 02:50:55,570 --> 02:50:57,251 but I mentioned this exists. 2991 02:50:57,251 --> 02:51:00,640 And then let me just ask the user\nfor whatever their x value is.
2992 02:51:00,640 --> 02:51:03,984 The thing in the quotes is\njust the English, or the string 2993 02:51:03,984 --> 02:51:06,650 that I'm printing on the screen,\nso I could say anything I want. 2994 02:51:06,650 --> 02:51:09,931 I'm just going to say x colon\nto prompt the user accordingly. 2995 02:51:09,931 --> 02:51:12,431 Now I'm going to go ahead and\nget another variable called y. 2996 02:51:13,630 --> 02:51:16,270 And now, I'm going to\nprompt the user for y. 2997 02:51:16,271 --> 02:51:18,551 And I'm just very nitpickily\nusing a space just 2998 02:51:18,550 --> 02:51:22,030 to move the cursor so it doesn't\nlook too messy on the screen. 2999 02:51:22,031 --> 02:51:27,221 And then lastly, let me go ahead and\n 3000 02:51:27,220 --> 02:51:31,720 In an ideal world, I would just\nsay something like printf x + y. 3001 02:51:31,720 --> 02:51:36,640 But that is not valid in C. The\n 3002 02:51:36,640 --> 02:51:39,261 has to be a string in double quotes. 3003 02:51:39,261 --> 02:51:43,060 So if I want to print out\nthe value of an integer 3004 02:51:43,060 --> 02:51:47,320 I need to put something in quotes\n 3005 02:51:47,320 --> 02:51:49,220 if I want to move the cursor as well. 3006 02:51:49,220 --> 02:51:51,640 So, again, we only glimpsed\nit briefly, but what 3007 02:51:51,640 --> 02:51:55,480 do I replace these question marks with\n 3008 02:51:56,570 --> 02:51:57,980 DAVID J. MALAN: Yeah, so %i. 3009 02:51:57,980 --> 02:52:00,591 Just like %s was string, %i is integer. 3010 02:52:02,361 --> 02:52:06,351 And now if I want to add x and y, for\n 3011 02:52:06,351 --> 02:52:09,771 doesn't do much of anything other\n 3012 02:52:11,341 --> 02:52:14,091 And, again, it looks definitely\ncryptic at first glance. 3013 02:52:14,091 --> 02:52:16,490 It would be nice if programming\nweren't this cryptic. 3014 02:52:16,490 --> 02:52:18,511 Other languages will\nclean this up for us.
3015 02:52:18,511 --> 02:52:22,191 But, again, if you focus on the\n 3016 02:52:22,191 --> 02:52:26,150 which is a format string with\nEnglish or whatever language 3017 02:52:27,591 --> 02:52:31,550 then it takes potentially more\narguments after the comma 3018 02:52:33,921 --> 02:52:36,800 All right, let me go ahead\nnow and make calculator 3019 02:52:36,800 --> 02:52:41,300 which, again, compiles\nmy source code in C 3020 02:52:41,300 --> 02:52:44,480 pictured above, and converts\nit into corresponding machine 3021 02:52:50,720 --> 02:52:53,841 Let's do 1 plus 1 and Enter. 3022 02:52:54,681 --> 02:52:57,531 Now I have the makings of a calculator. 3023 02:52:57,531 --> 02:53:00,421 Now let's start to tinker\nwith this a little bit. 3024 02:53:00,421 --> 02:53:02,271 What if I instead had done this? 3025 02:53:02,271 --> 02:53:08,150 int z = x + y and then plug-in z here. 3026 02:53:08,150 --> 02:53:14,841 If I rerun make calculator, Enter,\n 3027 02:53:14,841 --> 02:53:20,791 still equals 2, and let me claim that\n 3028 02:53:20,790 --> 02:53:22,820 which of these versions\nis better designed? 3029 02:53:22,820 --> 02:53:27,650 If both seem to be correct at very\n 3030 02:53:27,650 --> 02:53:30,710 or is the previous one without the z? 3031 02:53:30,710 --> 02:53:34,190 OK, so this one is arguably better\n 3032 02:53:34,191 --> 02:53:36,501 variable called z that I\ncannot only print but, heck 3033 02:53:36,501 --> 02:53:39,300 if my program is longer,\nI can use it elsewhere. 3034 02:53:42,272 --> 02:53:44,680 Debatable, like before, because\nit depends on my intent. 3035 02:53:44,681 --> 02:53:46,556 And, honestly, I think\na pretty good argument 3036 02:53:46,556 --> 02:53:48,001 can be made for the first version. 3037 02:53:48,001 --> 02:53:50,851 Because if I have no\nintention of-- as you note-- 3038 02:53:50,851 --> 02:53:53,801 using that variable\nagain, you know what? 
3039 02:53:53,800 --> 02:53:55,800 Maybe I might as well do\nthis, just because it's 3040 02:53:55,800 --> 02:53:57,331 one less thing to think about. 3041 02:53:58,441 --> 02:54:00,900 It's one less line of code\nto have to understand. 3042 02:54:02,040 --> 02:54:04,770 So here, again, it does\ndepend on your intention. 3043 02:54:04,771 --> 02:54:07,021 But this feels pretty reasonable. 3044 02:54:07,021 --> 02:54:09,301 And I think, as someone\nnoted earlier, when 3045 02:54:09,300 --> 02:54:13,411 I did the same thing with get_string,\n 3046 02:54:13,411 --> 02:54:16,335 s line because get_string and the\nwhat's your name inside of it 3047 02:54:16,335 --> 02:54:17,460 it was just so much longer. 3048 02:54:17,460 --> 02:54:20,850 But x + y, eh, it's not that hard\nto wrap our mind around what's 3049 02:54:20,851 --> 02:54:23,060 going on inside of the printf argument. 3050 02:54:23,060 --> 02:54:25,831 So, again, these are the kinds\nof thoughts that hopefully you'll 3051 02:54:25,831 --> 02:54:28,531 acquire the instinct for,\nnot necessarily reaching 3052 02:54:28,531 --> 02:54:30,990 the same answer as someone\nelse, but, again, the thought 3053 02:54:30,990 --> 02:54:33,220 process is what matters here. 3054 02:54:33,220 --> 02:54:36,180 All right, so how might I enhance\nthis program a little bit? 3055 02:54:36,181 --> 02:54:38,471 Let's just talk about\nstyle for just a moment. 3056 02:54:38,470 --> 02:54:42,990 So x and y, at least in this case,\n 3057 02:54:43,500 --> 02:54:45,750 Because those are the go-to\nvariable names in math 3058 02:54:45,750 --> 02:54:47,375 when you're adding two things together. 3059 02:54:47,375 --> 02:54:48,909 So x and y seem pretty reasonable. 3060 02:54:48,909 --> 02:54:53,159 I could have done something like,\nwell, maybe my first variable 3061 02:54:53,159 --> 02:54:56,730 should be called first\nnumber and my next variable 3062 02:54:56,730 --> 02:54:58,552 should be called second number.
3063 02:54:58,552 --> 02:55:00,511 And then down here, I\nwould have to change this 3064 02:55:00,511 --> 02:55:04,329 to first number plus second number. 3065 02:55:04,329 --> 02:55:07,050 Like, eh, this isn't really\nadding anything semantically 3066 02:55:08,341 --> 02:55:11,230 But that would be one other\n 3067 02:55:11,230 --> 02:55:14,761 So if you have very simple\nideas that are conventionally 3068 02:55:14,761 --> 02:55:18,790 expressed with common variable names\n 3069 02:55:18,790 --> 02:55:22,591 What if I want to annotate this program\n 3070 02:55:22,591 --> 02:55:25,381 Well, I can add in C\nwhat are called comments. 3071 02:55:25,380 --> 02:55:30,030 With a slash slash, two forward slashes,\n 3072 02:55:32,579 --> 02:55:35,400 And then down here, I could\ndo something like prompt user 3073 02:55:35,400 --> 02:55:37,619 for y, just to remind\nmyself what I'm doing there. 3074 02:55:37,620 --> 02:55:40,171 And down here, perform addition. 3075 02:55:40,171 --> 02:55:42,990 Now, in this case, I'm not\nsure these comments are really 3076 02:55:44,101 --> 02:55:47,820 Because in the time it took me to write\n 3077 02:55:47,820 --> 02:55:49,950 I could have just read\nthe three lines of code.
3078 02:55:49,950 --> 02:55:52,890 But as our programs\nget more sophisticated 3079 02:55:52,890 --> 02:55:55,548 and you start to learn more syntax-- 3080 02:55:55,549 --> 02:55:58,091 that, honestly, you might forget\nthe next day, the next week 3081 02:55:58,091 --> 02:56:01,890 the next month-- might be useful\n 3082 02:56:01,890 --> 02:56:04,681 reminds you of what your\ncode is doing or maybe even 3083 02:56:06,421 --> 02:56:09,329 With these early programs,\nnot really necessary 3084 02:56:09,329 --> 02:56:11,671 doesn't really add all that\nmuch to our comprehension 3085 02:56:11,671 --> 02:56:14,159 but it is a mechanism\nyou have in place that 3086 02:56:14,159 --> 02:56:18,060 can help you actually remind\nyourself or remind someone 3087 02:56:18,060 --> 02:56:20,279 else what it is that's going on. 3088 02:56:20,279 --> 02:56:23,070 Well, let me go ahead and rerun\n 3089 02:56:24,431 --> 02:56:27,001 And here, too, you might\nthink I'm typing crazy fast-- 3090 02:56:28,771 --> 02:56:32,201 So it turns out that\nLinux, the operating system 3091 02:56:32,200 --> 02:56:33,510 we're using here in the cloud-- 3092 02:56:33,511 --> 02:56:36,961 but, actually, Windows and Mac\nOS nowadays support this too-- 3093 02:56:38,800 --> 02:56:42,300 So if you only have one\nprogram that starts with C-A-L 3094 02:56:42,300 --> 02:56:45,210 you don't have to finish writing\n 3095 02:56:45,210 --> 02:56:47,770 and the computer will\nfinish your thought for you. 3096 02:56:47,771 --> 02:56:51,931 The other thing you can do is\nif you hit Up and keep going up 3097 02:56:51,931 --> 02:56:54,424 you'll scroll through your\nentire history of commands. 
3098 02:56:54,424 --> 02:56:56,341 So there too, I've been\nsaving some keystrokes 3099 02:56:56,341 --> 02:56:59,174 by hitting Up quickly rather than\n 3100 02:56:59,831 --> 02:57:02,130 So, again, just another\nlittle convenience 3101 02:57:02,130 --> 02:57:05,911 to make programming and interacting with\n 3102 02:57:05,911 --> 02:57:09,281 All right, let me go ahead and just make\n 3103 02:57:09,281 --> 02:57:11,131 The comments have no functional impact. 3104 02:57:11,130 --> 02:57:13,230 These green things are\njust notes to self. 3105 02:57:13,230 --> 02:57:15,421 Let me run calculator with\nmaybe-- how about this? 3106 02:57:15,421 --> 02:57:19,441 Instead of 1 plus 1,\nhow about 1 billion-- 3107 02:57:22,004 --> 02:57:23,171 whoops, let's do that again. 3108 02:57:24,460 --> 02:57:30,175 1 million, 1 billion, and another 1\n 3109 02:57:30,175 --> 02:57:31,550 All right, so that seems correct. 3110 02:57:31,550 --> 02:57:33,092 Let's run this program one more time. 3111 02:57:33,093 --> 02:57:39,281 How about 2 billion\nplus another 2 billion? 3112 02:57:42,120 --> 02:57:45,120 So, apparently, it's not so correct. 3113 02:57:45,120 --> 02:57:49,590 And, clearly, running 1 plus 1 was\n 3114 02:57:50,550 --> 02:57:53,235 What might have gone wrong? 3115 02:57:53,236 --> 02:57:54,361 What might have gone wrong? 3116 02:57:58,341 --> 02:58:00,511 The computer probably ran\nout of space with bits. 3117 02:58:00,511 --> 02:58:05,271 So it turns out with these data types--\n 3118 02:58:05,271 --> 02:58:09,650 and also float and char and those\n 3119 02:58:09,650 --> 02:58:13,280 and, most importantly, finite\nnumber of bits to represent them. 3120 02:58:14,960 --> 02:58:18,890 Newer computers use more bits, older\n 3121 02:58:18,890 --> 02:58:21,841 It's not necessarily standardized\nfor all of these data types. 
3122 02:58:21,841 --> 02:58:28,161 But in this case, in this environment,\n 3123 02:58:28,835 --> 02:58:30,980 So with 32 bits, you\ncan count pretty high. 3124 02:58:30,980 --> 02:58:34,161 This is 64 light bulbs on the\nstage and could count even higher. 3125 02:58:34,161 --> 02:58:38,331 An int is only using half of these, or\n 3126 02:58:38,331 --> 02:58:42,351 Now, if you think back to last week,\n 3127 02:58:42,351 --> 02:58:47,371 And if you have 8 bits, 8 zeros and\n 3128 02:58:47,370 --> 02:58:49,700 just a good number to\ngenerally remember as trivia. 3129 02:58:49,700 --> 02:58:53,540 8 bits gives you 256\npermutations of zeros and ones. 3130 02:58:53,540 --> 02:58:57,440 32 gives you roughly how\nmany, if anyone knows? 3131 02:58:59,841 --> 02:59:02,511 So it's roughly 4 billion, 2 to the 32. 3132 02:59:02,511 --> 02:59:04,310 If you don't know that, it's fine. 3133 02:59:04,310 --> 02:59:07,310 Most programmers, though, eventually\n 3134 02:59:10,790 --> 02:59:13,880 2 billion plus 2 billion\nis exactly 4 billion. 3135 02:59:13,880 --> 02:59:17,960 And that actually should\nfit in a 32-bit integer. 3136 02:59:17,960 --> 02:59:20,751 The catch is that my Mac,\nyour PC, and the like 3137 02:59:20,751 --> 02:59:22,791 also like to support negative numbers. 3138 02:59:22,790 --> 02:59:26,150 And if you want to support both positive\n 3139 02:59:26,150 --> 02:59:29,271 means with 32-bit integers,\nyou can count as high 3140 02:59:29,271 --> 02:59:33,543 as roughly 2 billion positive\nor 2 billion negative 3141 02:59:34,501 --> 02:59:38,060 That's still 4 billion, give or\ntake, but it's only half as many 3142 02:59:38,060 --> 02:59:39,661 in one direction or the other. 3143 02:59:39,661 --> 02:59:44,210 So how could I go about implementing\na correct calculator here? 3144 02:59:47,601 --> 02:59:50,181 Yeah, so not just li,\nwhich was for long integer. 
3145 02:59:50,181 --> 02:59:54,631 I have to make one more change,\n 3146 02:59:54,630 --> 02:59:59,090 So let me go back up here and change\n 3147 03:00:00,230 --> 03:00:02,400 And then let me change y as well. 3148 03:00:02,400 --> 03:00:05,900 And then let me change the format code\n 3149 03:00:07,970 --> 03:00:10,581 Let me recompile the calculator-- 3150 03:00:14,060 --> 03:00:15,601 That should obviously be the same. 3151 03:00:15,601 --> 03:00:20,701 Now let's do 2 billion\nand another 2 billion 3152 03:00:20,700 --> 03:00:22,330 and cross our fingers this time. 3153 03:00:22,331 --> 03:00:24,720 Now we're counting as high as 4 billion. 3154 03:00:24,720 --> 03:00:27,870 And we can go way higher than\n4 billion, but we're only 3155 03:00:27,870 --> 03:00:29,760 kicking the can down the street a bit. 3156 03:00:29,761 --> 03:00:31,261 Even though we're now using-- 3157 03:00:32,431 --> 03:00:37,781 64 bits, which is as long as this\n 3158 03:00:37,781 --> 03:00:40,381 It might be a really big\nvalue, but it's still finite. 3159 03:00:40,380 --> 03:00:42,810 And we'll come back at the\nend of today to these kinds 3160 03:00:44,040 --> 03:00:48,310 Because arguably now, my calculator\n 3161 03:00:48,310 --> 03:00:51,280 billions of possible inputs but not all. 3162 03:00:51,281 --> 03:00:53,401 And that's problematic\nif you actually want 3163 03:00:53,400 --> 03:00:58,740 to use my calculator for any\npossible inputs, not just 3164 03:00:58,740 --> 03:01:03,300 ones that are roughly less than,\n 3165 03:01:03,300 --> 03:01:05,292 All right, any questions then on that? 3166 03:01:05,292 --> 03:01:07,501 But it's really just a\nprecursor for all the problems 3167 03:01:07,501 --> 03:01:10,505 that we're going to have to\neventually deal with later on. 3168 03:01:15,688 --> 03:01:17,021 DAVID J. MALAN: A good question.
3169 03:01:17,261 --> 03:01:20,351 If we were still using z, we would\n 3170 03:01:20,351 --> 03:01:23,471 Otherwise, we'd be ignoring\n32 of the bits that 3171 03:01:23,470 --> 03:01:25,990 had been added together via the longs. 3172 03:01:27,831 --> 03:01:32,351 All right, so how about we spice things\n 3173 03:01:32,351 --> 03:01:35,711 how about something\nwith some conditions? 3174 03:01:35,710 --> 03:01:37,630 Let's start to ask\nsome actual questions. 3175 03:01:37,630 --> 03:01:44,090 So a moment ago, recall that we had\n 3176 03:01:44,091 --> 03:01:46,091 Now let's look back at\nsomething in Scratch that 3177 03:01:46,091 --> 03:01:48,632 looked a little something like\nthis, a bunch of puzzle pieces 3178 03:01:48,632 --> 03:01:50,710 asking questions by way\nof these conditionals 3179 03:01:50,710 --> 03:01:53,710 and then these Boolean expressions\n 3180 03:01:55,240 --> 03:01:58,300 In C, this actually maps pretty cleanly. 3181 03:01:58,300 --> 03:02:02,290 It's much cleaner from left to right\n 3182 03:02:02,290 --> 03:02:04,990 Here, we have just code\nthat looks like this. 3183 03:02:04,990 --> 03:02:09,251 If, a space, two parentheses\nand then x less than y 3184 03:02:09,251 --> 03:02:12,224 and then we have something like\nprintf there in the middle. 3185 03:02:12,224 --> 03:02:14,140 So here, it's actually\nkind of a nice mapping. 3186 03:02:14,140 --> 03:02:16,511 Notice that, just as the\nyellow puzzle piece in Scratch 3187 03:02:16,511 --> 03:02:18,644 is kind of hugging the\npurple puzzle piece 3188 03:02:18,644 --> 03:02:21,310 that's effectively the role that\nthese curly braces are playing. 3189 03:02:21,310 --> 03:02:24,700 They're sort of encapsulating\nall of the code on the inside. 3190 03:02:24,700 --> 03:02:27,850 The parentheses represent\nthe Boolean expression 3191 03:02:27,851 --> 03:02:32,411 that needs to be asked and answered to\n 3192 03:02:32,411 --> 03:02:34,841 And here's an exception to\nwhat I alluded to earlier. 
3193 03:02:34,841 --> 03:02:38,890 Usually, when you see a word and\nthen a parenthesis, something 3194 03:02:38,890 --> 03:02:42,130 and then closed parenthesis, I\n 3195 03:02:42,130 --> 03:02:44,320 And I'm still feeling pretty\ngood about that claim. 3196 03:02:45,761 --> 03:02:48,281 And the word if is not a function. 3197 03:02:48,281 --> 03:02:50,501 It's just a programming construct. 3198 03:02:50,501 --> 03:02:55,421 It's a feature of the C language\n 3199 03:02:55,421 --> 03:02:58,001 for different purposes\nfor a Boolean expression. 3200 03:02:58,001 --> 03:02:59,411 How about something like this? 3201 03:02:59,411 --> 03:03:02,501 Last week, if you wanted to\nhave a two-way fork in the road 3202 03:03:02,501 --> 03:03:05,501 go this way or that way,\nyou could have if and else. 3203 03:03:05,501 --> 03:03:08,091 In C, that would look a\nlittle something like this. 3204 03:03:08,091 --> 03:03:11,111 And if we add in the printf's,\nit now looks quite the same 3205 03:03:11,111 --> 03:03:15,041 but it adds, of course, the word else\n 3206 03:03:15,040 --> 03:03:21,190 As an aside, in C, it's not strictly\n 3207 03:03:21,191 --> 03:03:25,781 if you have only one line\nof code indented underneath. 3208 03:03:25,781 --> 03:03:30,068 For best practice, though, do so anyway,\n 3209 03:03:30,067 --> 03:03:31,900 and ultimately anyone\nelse reading your code 3210 03:03:31,900 --> 03:03:35,411 that you intend for just that one\n 3211 03:03:35,411 --> 03:03:37,060 How about this from last week? 3212 03:03:37,060 --> 03:03:38,681 Here was a three-way fork in the road. 3213 03:03:38,681 --> 03:03:45,073 If x is less than y, else if x is\n 3214 03:03:45,073 --> 03:03:47,531 Now, here's where you have some\ndisparities between Scratch 3215 03:03:47,531 --> 03:03:52,751 and C.
Scratch uses an equals sign\n 3216 03:03:52,751 --> 03:03:55,841 C uses a single equals sign\nfor assignment from right 3217 03:03:55,841 --> 03:03:58,611 to left, minor difference\nbetween the two worlds. 3218 03:03:58,611 --> 03:04:02,831 In C, we could implement the same\n 3219 03:04:02,831 --> 03:04:04,331 just this additional else if. 3220 03:04:04,331 --> 03:04:08,695 And if we add in the printf's, it\n 3221 03:04:08,695 --> 03:04:12,460 This is correct both in the\nScratch world and in the C world. 3222 03:04:12,460 --> 03:04:16,420 But could someone make a claim that\n 3223 03:04:18,581 --> 03:04:21,730 We need the else, at least,\nbut we don't need the last if. 3224 03:04:21,730 --> 03:04:24,220 Because, at least in the\nworld of comparing integers 3225 03:04:24,220 --> 03:04:27,220 it's either going to be less\nthan, greater than, or equal to. 3226 03:04:28,730 --> 03:04:31,331 So you can save a few\nseconds, if you will 3227 03:04:31,331 --> 03:04:35,380 of your program running-- a blink of\n 3228 03:04:35,380 --> 03:04:37,870 and then inferring what\nthe answer to the third 3229 03:04:37,870 --> 03:04:41,480 must be just by nature of\nyour own human logic here. 3230 03:04:41,480 --> 03:04:42,820 Now, why is that a good thing? 3231 03:04:42,820 --> 03:04:46,570 If, for instance, x and y\nhappen to equal each other-- 3232 03:04:46,570 --> 03:04:51,341 I type in 1 and 1 for both values,\n 3233 03:04:51,341 --> 03:04:55,570 in the case of this version,\nyou're sort of stupidly 3234 03:04:55,570 --> 03:04:59,021 asking three questions, all of\nwhich are going to get asked 3235 03:04:59,021 --> 03:05:02,320 even though the answer is no, no, yes. 
3236 03:05:05,800 --> 03:05:09,640 That seems to be unnecessary because\n 3237 03:05:09,640 --> 03:05:13,450 get rid of the unnecessary if and\n 3238 03:05:13,450 --> 03:05:16,480 else print that x is equal to y-- 3239 03:05:16,480 --> 03:05:20,960 now if x indeed equals y because\n 3240 03:05:20,960 --> 03:05:26,200 now you're only going to ask two\n 3241 03:05:26,200 --> 03:05:29,480 and then you're going to get\nyour same correct result. 3242 03:05:29,480 --> 03:05:32,140 So, again, a minor detail,\nbut, again, the kinds of things 3243 03:05:32,140 --> 03:05:33,849 you should be thinking\nabout, not only as 3244 03:05:33,849 --> 03:05:36,431 you write your code to\nbe correct but also write 3245 03:05:36,431 --> 03:05:38,831 it to be well-designed as well. 3246 03:05:38,831 --> 03:05:41,081 All right, so why don't we\ngo ahead and translate this 3247 03:05:41,081 --> 03:05:44,650 into the context of an\nactual program here? 3248 03:05:44,650 --> 03:05:46,570 I'll create a blank window here. 3249 03:05:46,570 --> 03:05:50,441 And let's do something with points,\n 3250 03:05:51,521 --> 03:05:54,641 Let me go ahead and\nrun code of points.c. 3251 03:05:54,640 --> 03:05:56,710 That's just going to\ngive me a new text file. 3252 03:05:56,710 --> 03:06:00,760 And then up here, I'm going to\ndo my usual, include cs50.h. 3253 03:06:04,781 --> 03:06:08,231 So a lot of boilerplate, so to\nspeak, in these early programs. 3254 03:06:09,521 --> 03:06:12,820 Let's ask the user, how\nmany points did they 3255 03:06:12,820 --> 03:06:15,060 lose on their most recent CS50 PSet? 3256 03:06:15,060 --> 03:06:19,040 So sort of evoke my photograph of\n 3257 03:06:19,040 --> 03:06:20,690 where I lost a couple of points myself. 
3258 03:06:23,800 --> 03:06:27,740 Then I'll ask a question in English\n 3259 03:06:29,300 --> 03:06:33,600 And then once I have this answer,\n 3260 03:06:33,601 --> 03:06:36,861 So if points is less than 2-- 3261 03:06:36,861 --> 03:06:40,521 borrowing the syntax that we\nsaw on the screen a moment ago-- 3262 03:06:40,521 --> 03:06:42,921 let's go ahead and print\nout something explanatory 3263 03:06:42,921 --> 03:06:49,191 like you lost fewer points\nthan me, backslash n. 3264 03:06:49,191 --> 03:06:52,221 else if points greater than 2-- 3265 03:06:52,220 --> 03:06:53,990 which is, again how many I lost-- 3266 03:06:53,990 --> 03:06:59,661 I'm going to go ahead and print out you\n 3267 03:06:59,661 --> 03:07:03,847 else if-- wait a minute, else seems\n 3268 03:07:03,847 --> 03:07:05,931 I'm just going to go ahead\nand print out something 3269 03:07:05,931 --> 03:07:12,621 like you lost the same number\nof points as me, backslash n. 3270 03:07:12,620 --> 03:07:16,790 So, really, just a straightforward\n 3271 03:07:16,790 --> 03:07:18,960 but to a concrete scenario here. 3272 03:07:18,960 --> 03:07:21,090 So let me go ahead and save this. 3273 03:07:21,091 --> 03:07:24,140 Let me go ahead and\nrun make points, Enter. 3274 03:07:27,021 --> 03:07:29,331 And then, how many points did you lose? 3275 03:07:31,102 --> 03:07:32,810 All right, you lost\nfewer points than me. 3276 03:07:37,081 --> 03:07:40,851 So, again, we have the ability to\n 3277 03:07:40,851 --> 03:07:44,841 from last week in reality, which is\n 3278 03:07:46,111 --> 03:07:50,811 There's something subtle here, though,\n 3279 03:07:50,810 --> 03:07:53,390 that someone might call a magic number. 3280 03:07:53,390 --> 03:07:56,581 This is programming speak\nfor something I've done here. 
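The points comparison walked through above can be sketched as follows. This is a minimal sketch, not the lecture's exact file: the lecture reads the value with get_int from the CS50 library, whereas here the value is passed in as a parameter so the code depends only on the standard library. The names compare_points and print_comparison are hypothetical, and the wording of the "more points" message is an assumption, since that caption is cut off in the transcript.

```c
#include <stdio.h>

// Returns the message for a given number of lost points.
// The value 2 is hard-coded in two places on purpose, mirroring
// the "magic number" version discussed in the lecture.
const char *compare_points(int points)
{
    if (points < 2)
    {
        return "You lost fewer points than me.";
    }
    else if (points > 2)
    {
        // Assumed wording; the caption is truncated in the transcript.
        return "You lost more points than me.";
    }
    else
    {
        // No third comparison needed: the values must be equal.
        return "You lost the same number of points as me.";
    }
}

// Prints the message followed by a newline.
void print_comparison(int points)
{
    printf("%s\n", compare_points(points));
}
```

Calling print_comparison(1) from main would print the "fewer points" message. Only two comparisons are ever made, because the final else covers the equal case.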
3281 03:07:56,581 --> 03:08:01,880 There's a bit of redundancy unrelated\n 3282 03:08:01,880 --> 03:08:07,470 But is there something I typed twice\n 3283 03:08:07,470 --> 03:08:11,430 Exactly, I've hard-coded, so to speak,\n 3284 03:08:11,431 --> 03:08:13,771 in two locations, in this case-- 3285 03:08:13,771 --> 03:08:15,701 that did not come from the user. 3286 03:08:15,700 --> 03:08:18,600 So, apparently, once I\ncompile this, this is it. 3287 03:08:18,601 --> 03:08:21,671 You're always comparing\nyourself to me in like, 1996 3288 03:08:21,671 --> 03:08:24,630 which for better or for worse,\nis all the program can do. 3289 03:08:24,630 --> 03:08:27,930 But this is an example too of\na magic number in the sense 3290 03:08:27,931 --> 03:08:31,691 like, wait, where did that 2 come\n 3291 03:08:31,691 --> 03:08:35,251 It feels like we are setting the\n 3292 03:08:35,251 --> 03:08:36,480 of screwing up down the road. 3293 03:08:36,480 --> 03:08:39,931 Because the longer this code gets,\n 3294 03:08:42,900 --> 03:08:45,030 am I going to keep typing the number 2? 3295 03:08:47,911 --> 03:08:49,861 But, honestly, eventually,\nyou're going to screw up 3296 03:08:49,861 --> 03:08:52,411 and you're going to miss one of the\n 3297 03:08:52,411 --> 03:08:54,911 because maybe I did worse the\nnext year, or 1, I did better. 3298 03:08:54,911 --> 03:08:57,550 And you don't want these\nnumbers to get out of sync. 3299 03:08:57,550 --> 03:09:01,050 So what would be a logical\nimprovement to this design 3300 03:09:01,050 --> 03:09:04,110 rather than hard-coding the\nsame number sort of magically 3301 03:09:07,011 --> 03:09:09,689 Yeah, why don't I make a\nvariable that I can use in there? 3302 03:09:09,689 --> 03:09:11,480 So, for instance, I\ncould create a variable 3303 03:09:11,480 --> 03:09:14,540 like this, another integer called mine. 3304 03:09:14,540 --> 03:09:16,370 And I'm just going to\ninitialize it to 2. 
3305 03:09:16,370 --> 03:09:19,370 And then I'm going to change\nmentions of 2 to this. 3306 03:09:19,370 --> 03:09:23,000 And mine is a pretty reasonable\nname for a variable insofar 3307 03:09:23,001 --> 03:09:27,261 as it refers to exactly\nwhose points are in question. 3308 03:09:27,261 --> 03:09:29,781 There's a risk here,\nthough, minor though it is. 3309 03:09:29,781 --> 03:09:32,361 I could accidentally\nchange mine at some point. 3310 03:09:32,361 --> 03:09:36,451 Maybe I forget what mine represents,\n 3311 03:09:36,450 --> 03:09:39,020 So there's a way to tell the\ncomputer "don't trust me 3312 03:09:39,021 --> 03:09:40,730 because I'm going to\nscrew up eventually" 3313 03:09:40,730 --> 03:09:43,310 by making a variable constant too. 3314 03:09:43,310 --> 03:09:45,780 So a constant in a\nprogramming language-- 3315 03:09:45,781 --> 03:09:47,421 this did not exist in Scratch-- 3316 03:09:47,421 --> 03:09:50,541 is just an additional hint to the\n 3317 03:09:50,540 --> 03:09:52,130 you to program more defensively. 3318 03:09:52,130 --> 03:09:55,340 If you don't trust\nyourself necessarily to not 3319 03:09:55,341 --> 03:09:57,650 screw up later, or\nhonestly, in practice 3320 03:09:57,650 --> 03:10:01,161 if you know that number should\nnever change, make it constant 3321 03:10:01,161 --> 03:10:02,820 and never think about it again. 3322 03:10:02,820 --> 03:10:08,091 This tells the compiler to make sure\n 3323 03:10:09,740 --> 03:10:14,240 And another convention in C and other\n 3324 03:10:14,240 --> 03:10:16,692 it's often common to just\ncapitalize the variable. 3325 03:10:16,692 --> 03:10:18,650 Kind of like you're\nyelling, but it really just 3326 03:10:18,650 --> 03:10:20,003 visually makes it stand out. 3327 03:10:20,003 --> 03:10:21,711 So it's kind of like\na nice rule of thumb 3328 03:10:21,710 --> 03:10:24,590 that helps you realize, oh,\nthat must be a constant.
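The constant described above might look like this in code. This is a sketch under assumptions: the name MINE follows the capitalization convention the lecture mentions, while the helper lost_fewer_than_me is hypothetical and exists only to show the constant in use.

```c
#include <stdbool.h>

// const asks the compiler to reject any later assignment to MINE.
// The capital letters are only a visual convention; it is the const
// keyword that actually makes the value read-only.
const int MINE = 2;

// Hypothetical helper: true if the given points are fewer than MINE.
bool lost_fewer_than_me(int points)
{
    // Writing MINE = 3; anywhere would now be a compile-time error.
    return points < MINE;
}
```

With the number defined once, changing the comparison value later means editing a single line rather than hunting for every 2.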
3329 03:10:24,591 --> 03:10:27,291 Capitalization alone does\nnot make it constant. 3330 03:10:28,880 --> 03:10:31,130 But the capitalization\nis just a visual reminder 3331 03:10:31,130 --> 03:10:35,040 that this is somewhere,\nsomehow a constant. 3332 03:10:35,040 --> 03:10:37,490 So just a minor refinement,\nbut, again, we're 3333 03:10:37,490 --> 03:10:41,450 sort of getting better at\nprogramming just by instilling 3334 03:10:43,700 --> 03:10:48,020 Questions, then, on conditionals\nin C or these constants? 3335 03:10:48,756 --> 03:10:53,810 AUDIENCE: Why do you not use\n 3336 03:10:53,810 --> 03:10:58,761 DAVID J. MALAN: Yeah, why do you\n 3337 03:10:59,361 --> 03:11:01,401 This is the way the\nlanguage was designed. 3338 03:11:03,501 --> 03:11:07,220 Generally speaking, when you're\n 3339 03:11:08,181 --> 03:11:09,861 there's no semicolons involved. 3340 03:11:09,861 --> 03:11:14,781 For now, assume that semicolons usually\n 3341 03:11:14,781 --> 03:11:18,411 That's not 100% reliable of a heuristic,\n 3342 03:11:20,720 --> 03:11:23,720 Left hand was not talking to the right\n 3343 03:11:25,191 --> 03:11:27,391 All right, so let's do something else. 3344 03:11:28,681 --> 03:11:31,761 If I have the ability to ask\nsomething conditionally-- 3345 03:11:31,761 --> 03:11:33,740 is this thing true or\nis this other thing-- 3346 03:11:33,740 --> 03:11:36,650 could I write a very simple program\n 3347 03:11:36,650 --> 03:11:39,980 tells me if a number the\nhuman types is even or odd? 3348 03:11:39,980 --> 03:11:42,390 Well, let me just get the\nframework for that in place. 3349 03:11:42,390 --> 03:11:44,990 Let me go ahead and\nwrite code of a parity-- 3350 03:11:44,990 --> 03:11:47,060 is a fancy way of saying even or odd. 3351 03:11:47,060 --> 03:11:53,511 And let me go ahead and include cs50.h,\n 3352 03:11:53,511 --> 03:11:55,521 again, more on those down the road. 
3353 03:11:55,521 --> 03:11:59,781 But, for now, I'm going to go ahead\n 3354 03:11:59,781 --> 03:12:03,351 by calling get_int and asking\nthem for whatever n is. 3355 03:12:03,351 --> 03:12:07,171 And then now I'm going to\nintroduce some pseudocode. 3356 03:12:07,171 --> 03:12:08,931 So here's the first\nexample of a program 3357 03:12:08,931 --> 03:12:12,001 honestly, that I'm not\nreally sure how to proceed. 3358 03:12:12,001 --> 03:12:14,691 So let me just resort to some\npseudocode using comments. 3359 03:12:14,691 --> 03:12:17,221 Eventually, I'll get rid of\nthis and write actual code. 3360 03:12:17,220 --> 03:12:22,675 But if n is even, then print-- 3361 03:12:22,675 --> 03:12:24,050 actually, let me just print that. 3362 03:12:24,050 --> 03:12:27,411 Let me just go ahead and say\nprintf, quote unquote, "even" 3363 03:12:27,411 --> 03:12:29,630 because I know how to use printf. 3364 03:12:29,630 --> 03:12:33,230 else-- all right, I\nknow how to printf odd 3365 03:12:33,230 --> 03:12:35,900 so let me just say printf,\nquote unquote, "odd". 3366 03:12:35,900 --> 03:12:39,630 So here, I've sort of taken a bite\n 3367 03:12:39,630 --> 03:12:42,680 And let me go ahead and put\nin my little placeholders. 3368 03:12:42,681 --> 03:12:44,581 I want to do some kind of conditions. 3369 03:12:44,581 --> 03:12:49,400 So if, question marks now, let me go\n 3370 03:12:53,540 --> 03:12:55,640 I'm getting closer to solving this. 3371 03:12:55,640 --> 03:12:59,370 But I still have this\nquestion mark here. 3372 03:12:59,370 --> 03:13:06,090 How, using syntax we've seen, might\n 3373 03:13:08,120 --> 03:13:12,050 There's this little operator I\n 3374 03:13:12,050 --> 03:13:14,100 operator, that will let\nyou do exactly that. 
3375 03:13:14,101 --> 03:13:16,663 If you divide any number by\n2, that mathematical heuristic 3376 03:13:16,663 --> 03:13:19,371 is going to tell you if it's even\n 3377 03:13:21,290 --> 03:13:25,040 And that's nice because the alternative\n 3378 03:13:25,040 --> 03:13:33,440 like if n == 0 or if n\nequals 2 or n equals 4-- 3379 03:13:33,441 --> 03:13:37,070 your code would be infinitely long if\n 3380 03:13:37,070 --> 03:13:43,911 But if I do n divided by 2\nand look at the remainder-- 3381 03:13:43,911 --> 03:13:47,281 it's a little cryptic, but\nthis will indeed do the trick. 3382 03:13:47,281 --> 03:13:51,471 So the percent sign is\nthe remainder operator. 3383 03:13:51,470 --> 03:13:56,790 It does numerator divided by denominator\n 3384 03:13:56,790 --> 03:13:58,980 but, rather, the remainder of that. 3385 03:13:58,980 --> 03:14:02,511 So if you divide anything by 2,\n 3386 03:14:02,511 --> 03:14:06,681 And if, indeed, 2 divides\ninto n evenly, giving you 0 3387 03:14:06,681 --> 03:14:08,060 then you're going to print even. 3388 03:14:10,070 --> 03:14:14,751 But there is something odd-- pun\n 3389 03:14:14,751 --> 03:14:20,640 What is another new piece of syntax,\n 3390 03:14:25,540 --> 03:14:28,090 And I even caught myself\nverbally saying it a moment ago 3391 03:14:28,091 --> 03:14:29,470 just because it's so ingrained. 3392 03:14:34,581 --> 03:14:36,331 DAVID J. MALAN: Yeah, if\nsomething's equivalent to the other. 3393 03:14:36,331 --> 03:14:38,480 So now this is the equality operator. 3394 03:14:38,480 --> 03:14:40,280 It's not assignment from right to left. 3395 03:14:40,281 --> 03:14:42,561 And this one too is an\nexample of, literally 3396 03:14:42,560 --> 03:14:45,440 humans not really planning\nahead, perhaps, left hand 3397 03:14:45,441 --> 03:14:47,541 not talking to right hand\nin that someone decided 3398 03:14:47,540 --> 03:14:49,161 let's use the equals\nsign for assignment. 
3399 03:14:49,161 --> 03:14:52,036 And then some number of minutes or\n 3400 03:14:52,036 --> 03:14:53,841 how do we now compare for equality? 3401 03:14:55,054 --> 03:14:58,220 And if you think this is a little weird,\n 3402 03:14:58,220 --> 03:15:01,050 there's a third version where\nyou use three equal signs. 3403 03:15:01,050 --> 03:15:03,331 So, again, it's humans that\ndesign these languages. 3404 03:15:03,331 --> 03:15:06,528 So if you're ever frustrated by them,\n 3405 03:15:06,528 --> 03:15:08,361 it might just not have\nbeen the best design. 3406 03:15:08,361 --> 03:15:11,191 But we just kind of have\nto live with it ever since. 3407 03:15:11,191 --> 03:15:12,891 So let me go ahead and zoom out here. 3408 03:15:12,890 --> 03:15:15,751 Let me go ahead and make parity here. 3409 03:15:15,751 --> 03:15:20,570 So make parity-- and, again, parity\n 3410 03:15:20,570 --> 03:15:23,300 ./parity, type in a number like 2. 3411 03:15:26,421 --> 03:15:29,720 3, that's indeed odd, and so forth. 3412 03:15:29,720 --> 03:15:32,870 If we continue testing, presumably,\n 3413 03:15:34,171 --> 03:15:37,701 Let me go ahead now and let me start\n 3414 03:15:37,700 --> 03:15:40,760 because, admittedly, it's getting\n 3415 03:15:40,761 --> 03:15:42,681 all of that boilerplate at the top. 3416 03:15:42,681 --> 03:15:46,011 Let me create a program\ncalled agree.c that's 3417 03:15:46,011 --> 03:15:48,111 reminiscent of any of\nthose forms you have 3418 03:15:48,111 --> 03:15:52,291 to agree to online with a checkbox\n 3419 03:15:52,290 --> 03:15:55,460 So let me throw away all the\nguts of this main program 3420 03:15:55,460 --> 03:15:57,780 and now ask something like this. 3421 03:15:57,781 --> 03:16:01,621 Let me go ahead and prompt\nuser to agree to something. 3422 03:16:01,620 --> 03:16:06,800 I'm going to go ahead and say, how\n 3423 03:16:06,800 --> 03:16:12,110 whatever the question might be--\n 3424 03:16:12,111 --> 03:16:13,951 for yes or no, respectively. 
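The parity check built above can be sketched as a small function. As an assumption for the sake of a self-contained example, the number is passed in as a parameter rather than read with get_int, and the function name parity is hypothetical.

```c
#include <stdio.h>

// Returns "even" if 2 divides n with no remainder, otherwise "odd".
const char *parity(int n)
{
    // % is the remainder operator; == compares for equality,
    // whereas a single = would be assignment.
    if (n % 2 == 0)
    {
        return "even";
    }
    else
    {
        return "odd";
    }
}

// Prints the result followed by a newline.
void print_parity(int n)
{
    printf("%s\n", parity(n));
}
```

Comparing the remainder against 0 rather than against 1 also handles negative odd numbers, whose remainder in C is -1.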
3425 03:16:13,950 --> 03:16:16,970 So if it's only a single\ncharacter, actually, I 3426 03:16:16,970 --> 03:16:18,610 can actually get by with just get_char. 3427 03:16:18,611 --> 03:16:20,361 Not used it before,\nbut it was on our menu 3428 03:16:20,361 --> 03:16:22,341 of functions from the CS50 library. 3429 03:16:22,341 --> 03:16:25,431 And if I want to get\nthe user's response 3430 03:16:25,431 --> 03:16:28,501 the return value should be\na char also on the left. 3431 03:16:28,501 --> 03:16:31,281 So now we've seen strings,\nints, and now chars 3432 03:16:31,281 --> 03:16:33,171 if we only care about a single letter. 3433 03:16:33,171 --> 03:16:38,150 And now let's go ahead,\ncheck whether user agreed. 3434 03:16:38,150 --> 03:16:47,751 So how about if c == 'y', then let me\n 3435 03:16:47,751 --> 03:16:51,900 print out agreed or some\nsuch sentence like that. 3436 03:16:51,900 --> 03:16:54,560 else if they did not type\nc-- or you know what? 3437 03:16:54,560 --> 03:16:56,390 Let's be explicit here,\njust so they can't 3438 03:16:56,390 --> 03:16:59,060 type z or b or some random letter. 3439 03:16:59,060 --> 03:17:07,011 else if c == 'n', n for no, then let me\n 3440 03:17:07,978 --> 03:17:10,520 And I'm just going to ignore\nthe user if they don't cooperate 3441 03:17:10,521 --> 03:17:14,781 and they type z or b or\nsomething that's not y or n. 3442 03:17:14,781 --> 03:17:21,141 All right, let me go ahead now and\n 3443 03:17:22,570 --> 03:17:25,251 Let's go with the default.\nOK, so that seems to work. 3444 03:17:25,251 --> 03:17:27,201 No, I don't agree this time. 3445 03:17:28,411 --> 03:17:33,921 How about my caps lock key is on or\n 3446 03:17:37,972 --> 03:17:41,181 So, obviously, a bug, at least if I want\n 3447 03:17:41,181 --> 03:17:43,021 which is kind of reasonable. 3448 03:17:43,021 --> 03:17:48,480 So what would be the possible\nsolutions here, do you think?
3449 03:17:48,480 --> 03:17:51,620 How do I solve this and tolerate\nboth capital and lowercase? 3450 03:17:51,620 --> 03:17:54,335 Maybe what's the simplest,\nmost naive implementation? 3451 03:17:56,490 --> 03:17:58,990 DAVID J. MALAN: Yeah, so why\ndon't I just ask two questions? 3452 03:17:58,990 --> 03:18:03,550 Or you know what, even more simplistic\n 3453 03:18:03,550 --> 03:18:06,911 if you will, let me just copy\nand paste some of this code. 3454 03:18:06,911 --> 03:18:11,781 Change this to an else-- whoops,\nnot in caps-- else if 'Y'. 3455 03:18:11,781 --> 03:18:13,781 And then I bet I could\ndo the same thing with n. 3456 03:18:13,781 --> 03:18:15,641 But here too, just like\nwith Scratch, as soon 3457 03:18:15,640 --> 03:18:17,724 as you start to find\nyourself copying and pasting 3458 03:18:17,724 --> 03:18:19,480 you're probably doing something wrong. 3459 03:18:19,480 --> 03:18:22,810 And what you said verbally,\nif I may, was actually better. 3460 03:18:22,810 --> 03:18:29,501 Because you're implying that I could\n 3461 03:18:32,841 --> 03:18:39,711 The catch is, you can't use the word OR\n 3462 03:18:39,710 --> 03:18:43,180 So you can express one\nquestion or another. 3463 03:18:43,181 --> 03:18:46,301 You only need one of the\nanswers to be yes or true 3464 03:18:46,300 --> 03:18:48,220 and you use two vertical bars. 3465 03:18:48,220 --> 03:18:50,980 By contrast, just so\nyou've seen it, if you 3466 03:18:50,980 --> 03:18:54,341 wanted to check if something is equal\n 3467 03:18:54,341 --> 03:18:56,890 you could use two ampersands. 3468 03:18:56,890 --> 03:18:59,411 This logically would make\nno sense here, though. 3469 03:18:59,411 --> 03:19:02,650 Certainly, what the human typed can't\n 3470 03:19:03,921 --> 03:19:05,711 So in this case, we do want OR. 3471 03:19:05,710 --> 03:19:07,475 But that allows me to\ntighten my code up. 3472 03:19:07,476 --> 03:19:09,851 I don't have to start copying\nand pasting whole branches.
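The logical OR described above can be sketched like this. It is a minimal sketch under assumptions: the lecture reads the character with get_char from the CS50 library, while here it is passed in as a parameter, the function name check_agreement is hypothetical, and the exact output strings are guesses because the captions are truncated.

```c
#include <stdio.h>

// Returns a response for the character the user typed.
// || means "or": only one side needs to be true.
// Single quotes mark single chars; double quotes would mean strings.
const char *check_agreement(char c)
{
    if (c == 'y' || c == 'Y')
    {
        return "Agreed.";
    }
    else if (c == 'n' || c == 'N')
    {
        return "Not agreed.";
    }
    // Any other character is ignored, as in the lecture.
    return "";
}

// Prints the response, if there is one.
void print_agreement(char c)
{
    const char *message = check_agreement(c);
    if (message[0] != '\0')
    {
        printf("%s\n", message);
    }
}
```

Each side of || has to be a complete comparison: c == 'y' || c == 'Y', not c == 'y' || 'Y'.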
3473 03:19:09,851 --> 03:19:13,560 I can now ask two questions at once. 3474 03:19:13,560 --> 03:19:16,870 Questions, then, on this variation? 3475 03:19:17,869 --> 03:19:19,661 Can you convert the\ninput to all lowercase? 3476 03:19:20,831 --> 03:19:22,480 We don't have the capability yet. 3477 03:19:22,480 --> 03:19:24,940 It turns out that's going to require-- 3478 03:19:24,941 --> 03:19:27,166 to be easy, another library,\nthough we could do it 3479 03:19:27,165 --> 03:19:30,040 ourselves knowing a little bit about\n 3480 03:19:30,040 --> 03:19:34,113 But, yes, that would be an alternative,\n 3481 03:19:37,140 --> 03:19:38,390 DAVID J. MALAN: Good question. 3482 03:19:38,390 --> 03:19:41,987 Unfortunately, you have to be explicit\n 3483 03:19:41,987 --> 03:19:44,320 even though that's kind of\nhow you might think about it. 3484 03:19:44,320 --> 03:19:49,240 You have to ask a complete question\n 3485 03:19:50,230 --> 03:19:51,940 Let me ask a question now too. 3486 03:19:53,050 --> 03:19:57,760 I deliberately used single quotes\n 3487 03:19:59,351 --> 03:20:03,521 Previously, we used double quotes\n 3488 03:20:05,911 --> 03:20:08,970 Correct, string is double quotes for\n 3489 03:20:10,050 --> 03:20:14,210 And single quotes for single characters. 3490 03:20:14,210 --> 03:20:15,800 Because my data type is different. 3491 03:20:15,800 --> 03:20:18,770 I chose the simple route of\njust using a single char. 3492 03:20:18,771 --> 03:20:21,351 In fact, this program\nwon't work with Y-E-S 3493 03:20:21,351 --> 03:20:25,041 or N-O. That's not supported at the\n 3494 03:20:25,040 --> 03:20:27,831 I had to use single quotes\nbecause that's how C does it. 3495 03:20:27,831 --> 03:20:29,790 If you're dealing with\nsingle characters 3496 03:20:29,790 --> 03:20:31,850 a.k.a. chars, use single quotes. 
3497 03:20:32,781 --> 03:20:36,591 even if it's one single\ncharacter in a string 3498 03:20:36,591 --> 03:20:39,623 as though you're starting to write\n 3499 03:20:39,623 --> 03:20:40,790 that would be double quotes. 3500 03:20:40,790 --> 03:20:42,860 And we'll see why this\nis before long too. 3501 03:20:42,861 --> 03:20:46,551 But, again, just things to keep\nin mind whenever writing code 3502 03:20:46,550 --> 03:20:48,740 in this particular language. 3503 03:20:50,970 --> 03:20:56,060 So, short answer, if I'm understanding\n 3504 03:20:56,060 --> 03:20:58,270 And this would be even more incorrect. 3505 03:20:58,271 --> 03:21:00,771 But if you don't mind, let me\nkick the can a couple of weeks 3506 03:21:00,771 --> 03:21:02,480 on this as to why this doesn't work. 3507 03:21:02,480 --> 03:21:06,852 The most pleasant way to do this would\n 3508 03:21:06,852 --> 03:21:08,810 But even this is a slippery\nslope, because what 3509 03:21:08,810 --> 03:21:12,179 if the user does something weird,\n 3510 03:21:12,179 --> 03:21:13,970 You can imagine this\ngetting messy quickly. 3511 03:21:13,970 --> 03:21:16,310 I like your idea earlier\nabout just forcing everything 3512 03:21:16,310 --> 03:21:18,560 to lowercase just to standardize things. 3513 03:21:18,560 --> 03:21:22,700 Unfortunately, you cannot compare\nstrings for equality like this 3514 03:21:22,700 --> 03:21:24,780 for, again, reasons we'll\ncome to before long. 3515 03:21:24,781 --> 03:21:27,681 So for today, we're keeping it\n 3516 03:21:27,681 --> 03:21:31,640 not nearly as user-friendly to\nonly tolerate individual letters. 3517 03:21:31,640 --> 03:21:34,820 And there's a question over here.
3518 03:21:34,820 --> 03:21:36,980 On the US English keyboard\nit's shift and then 3519 03:21:36,980 --> 03:21:40,081 the backslash key above Return,\nbut depending on your keyboard 3520 03:21:42,320 --> 03:21:45,140 All right, so let's\nactually now look back 3521 03:21:45,140 --> 03:21:47,181 at something we did a\nlittle bit of last week. 3522 03:21:47,181 --> 03:21:49,971 Let me go ahead and open\na file called meow.c 3523 03:21:49,970 --> 03:21:52,460 because, recall, that's what\nwe had Scratch do initially. 3524 03:21:52,460 --> 03:21:55,251 Let me include not the\nCS50 library this time 3525 03:21:55,251 --> 03:21:59,150 but just stdio.h because I\nonly want printf for this demo. 3526 03:21:59,150 --> 03:22:02,720 Let me go ahead now and\njust print out meow. 3527 03:22:02,720 --> 03:22:06,411 And then if I want the cat to meow\n 3528 03:22:11,210 --> 03:22:14,120 The program is written--\ncorrect, I claim. 3529 03:22:15,591 --> 03:22:17,841 But, again, this was the\nbeginning of our conversation 3530 03:22:17,841 --> 03:22:20,541 last week of not being\nparticularly well-designed. 3531 03:22:20,540 --> 03:22:23,630 And if someone wants to maybe\npoint out the now obvious 3532 03:22:23,630 --> 03:22:28,380 why is this not\nwell-designed, necessarily? 3533 03:22:28,380 --> 03:22:29,820 Yeah, it's just repetition, right? 3534 03:22:29,820 --> 03:22:31,894 Again, I literally\nresorted to copy-paste. 3535 03:22:31,894 --> 03:22:33,810 That should be the signal\nthat you're probably 3536 03:22:33,810 --> 03:22:37,870 doing something wrong or, at best,\n 3537 03:22:37,870 --> 03:22:40,195 So the solution, as you\nmight glean from last week 3538 03:22:40,195 --> 03:22:42,570 is probably going to be one\nof those things called loops. 3539 03:22:42,570 --> 03:22:45,271 So let's just take a look at some\nof the syntax for loops in C.
3540 03:22:45,271 --> 03:22:48,084 But, again, no new ideas,\nit's just some new syntax 3541 03:22:48,084 --> 03:22:49,501 that'll take some getting used to. 3542 03:22:49,501 --> 03:22:53,130 In Scratch, if you wanted to meow\n 3543 03:22:53,130 --> 03:22:57,880 there's not a forever keyword in C, so\n 3544 03:22:57,880 --> 03:22:59,220 But this is the best we can do. 3545 03:22:59,220 --> 03:23:03,331 It turns out there is a\nkeyword called while in C. 3546 03:23:03,331 --> 03:23:05,550 And that kind of has\nthe right semantics 3547 03:23:05,550 --> 03:23:08,740 because it's like while I do\nsomething again and again 3548 03:23:10,230 --> 03:23:14,790 But just like an if condition\nor an else if condition 3549 03:23:14,790 --> 03:23:18,060 those took a Boolean\nexpression in parentheses 3550 03:23:18,060 --> 03:23:21,130 a while loop also takes a Boolean\nexpression in parentheses. 3551 03:23:22,531 --> 03:23:26,041 Now, if I want to do something\n 3552 03:23:26,040 --> 03:23:30,540 say while 2 is greater than\n1, while 3 is greater than 2 3553 03:23:30,540 --> 03:23:32,430 or just something completely arbitrary. 3554 03:23:32,431 --> 03:23:36,240 But that should rub you the wrong\n 3555 03:23:36,240 --> 03:23:41,130 Why 3-- if you want true, just say true. 3556 03:23:41,130 --> 03:23:45,780 So it turns out in C, there are\n 3557 03:23:45,781 --> 03:23:48,901 that are literally true\nand false, respectively. 3558 03:23:48,900 --> 03:23:53,341 I could also put the number 1 for\n 3559 03:23:53,341 --> 03:23:56,081 but most people would just\nsay true to be explicit. 3560 03:23:56,081 --> 03:23:59,431 So it's a little hackish, if\nyou will, but very conventional. 3561 03:23:59,431 --> 03:24:03,781 There's no forever keyword in C. If\n 3562 03:24:03,781 --> 03:24:06,191 I'm going to just use\nsomething like printf here. 
3563 03:24:06,191 --> 03:24:08,731 So, again, not perfect\ntranslation from one 3564 03:24:08,730 --> 03:24:11,708 to the other, but absolutely\npossible in C. What about this? 3565 03:24:11,708 --> 03:24:14,041 This is a little more common\nif you want to do something 3566 03:24:14,040 --> 03:24:17,280 a finite number of times, like repeat 3. 3567 03:24:17,281 --> 03:24:21,901 There's a few different ways we can\n 3568 03:24:21,900 --> 03:24:25,470 And here's where C-- like a\nlot of text-based languages 3569 03:24:25,470 --> 03:24:28,921 you kind of have to whip out that\n 3570 03:24:28,921 --> 03:24:31,171 blocks and think about,\nall right, how can I 3571 03:24:31,171 --> 03:24:35,880 build a little machine in software that\n 3572 03:24:35,880 --> 03:24:40,380 Well, let me give myself a variable\n 3573 03:24:40,380 --> 03:24:46,740 Let me create a loop whose Boolean\n 3574 03:24:46,740 --> 03:24:50,310 the idea being here, why don't\nI just kind of count 1, 2, 3? 3575 03:24:50,310 --> 03:24:53,790 So how do I implement\nthis physicality in code? 3576 03:24:53,790 --> 03:24:57,180 I give myself a variable,\nset it to 0, 0 fingers up. 3577 03:24:57,181 --> 03:25:00,001 Now, I ask the question,\nis counter less than 3? 3578 03:25:00,001 --> 03:25:02,970 If so, go ahead and print out meow. 3579 03:25:02,970 --> 03:25:06,331 And just intuitively, even if\n 3580 03:25:06,331 --> 03:25:09,480 before Scratch, what\nmore do I need to do? 3581 03:25:09,480 --> 03:25:12,421 I've left room here for\none more line of logic. 3582 03:25:13,990 --> 03:25:15,220 We have to increase counter. 3583 03:25:15,220 --> 03:25:19,150 So I need code like I showed earlier,\n 3584 03:25:19,150 --> 03:25:21,400 And so here's where\nprogramming sometimes 3585 03:25:21,400 --> 03:25:22,921 becomes a bit more like plumbing. 3586 03:25:22,921 --> 03:25:25,421 You can't just say what you\nmean, like you could in Scratch.
3587 03:25:25,421 --> 03:25:28,001 You have to build a little\nsort of software machine 3588 03:25:28,001 --> 03:25:31,181 that initializes a value, does\n 3589 03:25:31,181 --> 03:25:34,091 And so it's kind of like\nthis software-based machine 3590 03:25:34,091 --> 03:25:37,211 but together, that's just using\nsome familiar building blocks. 3591 03:25:38,380 --> 03:25:40,672 Just like in Scratch, you\nmight have used loops a bunch 3592 03:25:40,673 --> 03:25:42,161 of times, pretty common in C. 3593 03:25:42,161 --> 03:25:44,050 So can we tighten this code up? 3594 03:25:44,050 --> 03:25:48,581 This is correct, but here are\nsome conventions that are popular. 3595 03:25:48,581 --> 03:25:51,025 If you're going to count, just say i. 3596 03:25:51,025 --> 03:25:52,900 A convention in\nprogramming-- with, at least 3597 03:25:52,900 --> 03:25:57,490 languages like C-- is just use i\n 3598 03:25:57,490 --> 03:25:59,650 is to count from like, 0 on up. 3599 03:26:01,900 --> 03:26:04,780 It's just more verbose\nthan you need to be. 3600 03:26:05,740 --> 03:26:07,532 You don't need more semantics than that. 3601 03:26:07,532 --> 03:26:08,990 All right, what else can I do here? 3602 03:26:08,990 --> 03:26:12,177 There's another opportunity\nto tighten up this code. 3603 03:26:14,720 --> 03:26:17,720 Yeah, that syntactic sugar\nthat does nothing new 3604 03:26:17,720 --> 03:26:19,520 but it does it more succinctly. 3605 03:26:19,521 --> 03:26:23,990 I can change this to either the\n 3606 03:26:25,700 --> 03:26:27,920 Now, this is pretty canonical. 3607 03:26:27,921 --> 03:26:31,941 This is how most people\nwould implement something 3608 03:26:31,941 --> 03:26:34,611 three times using a loop in C-- 3609 03:26:34,611 --> 03:26:36,336 using a while loop, that is. 
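The canonical while loop described above might look like this. It is a sketch only: the function name meow_with_while is hypothetical, and returning the final counter value is an addition made here so the behavior can be checked, not something the lecture does.

```c
#include <stdio.h>

// Prints "meow" three times with a while loop and returns
// the final value of the counter.
int meow_with_while(void)
{
    int i = 0;          // initialize: zero fingers up
    while (i < 3)       // check the condition before every pass
    {
        printf("meow\n");
        i++;            // update: shorthand for i = i + 1
    }
    return i;
}
```

Leaving out the i++ line would make the condition stay true forever, which is the same effect as writing while (true).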
3610 03:26:36,335 --> 03:26:39,350 Turns out that it's so common\nin C and other languages 3611 03:26:39,351 --> 03:26:43,161 to do something finitely many times,\n 3612 03:26:43,161 --> 03:26:46,021 In this model, to be\nclear, the logic, though 3613 03:26:46,021 --> 03:26:49,221 is that we start by initializing the\n 3614 03:26:49,220 --> 03:26:52,310 We then ask the question,\nis i less than 3? 3615 03:26:52,310 --> 03:26:56,030 If so, everything that's\nindented inside the curly braces 3616 03:26:56,031 --> 03:26:59,240 gets executed-- namely,\nmeow then the update. 3617 03:26:59,240 --> 03:27:02,931 Then the computer is going to\nhave to recheck the condition 3618 03:27:02,931 --> 03:27:06,531 to make sure that i hasn't gotten\n 3619 03:27:06,531 --> 03:27:09,621 But if not, it then does this\nagain and it does this again. 3620 03:27:09,620 --> 03:27:12,045 And then it repeats, constantly\nchecking the condition 3621 03:27:12,046 --> 03:27:14,421 and executing what's in the\nblock, checking the condition 3622 03:27:14,421 --> 03:27:15,837 and executing what's in the block. 3623 03:27:15,837 --> 03:27:20,390 After three times of that, the condition\n 3624 03:27:21,620 --> 03:27:24,740 It just proceeds to whatever's\n 3625 03:27:24,740 --> 03:27:27,390 It jumps to the next blocks down below. 3626 03:27:27,390 --> 03:27:30,181 All right, what's another\nway, though, to do this? 3627 03:27:30,181 --> 03:27:32,181 Well, I've deliberately\nbeen counting from 0-- 3628 03:27:32,181 --> 03:27:33,972 and that's a programming\nconvention, right? 3629 03:27:33,972 --> 03:27:36,688 We started last week with all\nthe light bulbs off, which was 0. 3630 03:27:36,688 --> 03:27:39,021 So it's pretty reasonable to\nstart counting at 0's, just 3631 03:27:39,853 --> 03:27:41,870 Like, no fingers are up, this is 0-- 3632 03:27:43,560 --> 03:27:47,841 But if you prefer, you could\nstart counting at i equals 1.
3633 03:27:47,841 --> 03:27:51,140 But then you don't want to\ndo it while i is less than 3 3634 03:27:51,140 --> 03:27:54,200 you want to do i is\nless than or equal to 3. 3635 03:27:54,200 --> 03:27:58,640 On most keyboards, there's no symbol for\n 3636 03:27:58,640 --> 03:28:02,690 or equal to, so in C, you\nuse two characters, less than 3637 03:28:02,691 --> 03:28:05,811 and then an equals sign\nwith no spaces in between. 3638 03:28:05,810 --> 03:28:08,251 That just means less than or equal to. 3639 03:28:08,251 --> 03:28:12,771 We could change it to set i to 2\n 3640 03:28:13,820 --> 03:28:18,900 We could make this be a 10\nand less than or equal to 12. 3641 03:28:18,900 --> 03:28:20,810 But, again, just stick with the basics. 3642 03:28:20,810 --> 03:28:23,931 Start at 0 and count on up\nwould be the convention. 3643 03:28:23,931 --> 03:28:27,291 Or if you prefer to count\ndown, that's fine too. 3644 03:28:27,290 --> 03:28:31,700 Set i to 3 and then do this so\nlong as i is greater than 0 3645 03:28:31,700 --> 03:28:34,552 but you have to decrement\ninstead of increment. 3646 03:28:34,552 --> 03:28:36,261 So, again, we could\ndo this all day long. 3647 03:28:36,261 --> 03:28:39,396 There's literally an infinite number\n 3648 03:28:39,396 --> 03:28:41,271 And that's why I keep\nemphasizing convention. 3649 03:28:41,271 --> 03:28:43,761 Call the variable i for\nsomething like this 3650 03:28:43,761 --> 03:28:47,271 initialize it to 0 for something like\n 3651 03:28:47,271 --> 03:28:49,161 unless you really prefer to count down. 3652 03:28:49,161 --> 03:28:51,921 Again, just certain human conventions. 3653 03:28:51,921 --> 03:28:55,111 All right, how about\nanother way to do this? 3654 03:28:55,111 --> 03:28:59,070 This is what's called a for\nloop in C, also very common. 3655 03:28:59,070 --> 03:29:02,091 It's not quite as straightforward\n 3656 03:29:02,091 --> 03:29:04,011 to bottom in exactly the same way. 
3657 03:29:04,011 --> 03:29:07,400 This kind of has a lot more\nlogic tucked into its first line. 3658 03:29:07,400 --> 03:29:09,921 But it does exactly the same thing. 3659 03:29:11,630 --> 03:29:15,350 notice that inside the\nparentheses, next to the word for 3660 03:29:15,351 --> 03:29:18,523 there's two semicolons-- which\nis another weird use of syntax. 3661 03:29:18,522 --> 03:29:20,480 They're not at the end\nof the line, now they're 3662 03:29:20,480 --> 03:29:21,855 in the middle of the parentheses. 3663 03:29:21,855 --> 03:29:24,030 But that's what the\nhumans chose years ago. 3664 03:29:24,031 --> 03:29:30,951 The first thing before the semicolons\n 3665 03:29:30,950 --> 03:29:34,040 The next thing is the condition\nthat's going to constantly get 3666 03:29:34,040 --> 03:29:36,560 checked every cycle through this loop. 3667 03:29:36,560 --> 03:29:41,360 And the last thing is going to be\n 3668 03:29:41,361 --> 03:29:42,990 in this case is going to be count up. 3669 03:29:42,990 --> 03:29:45,710 So, again, if I rewind\nwe initialize i to 0. 3670 03:29:45,710 --> 03:29:48,350 We then ask the question,\nis i less than 3? 3671 03:29:48,351 --> 03:29:52,611 If so, execute what's\ninside of the loop. 3672 03:29:52,611 --> 03:29:58,490 Then the computer does this, it does\n 3673 03:29:58,490 --> 03:30:01,220 And then it's not going\nto blindly meow again. 3674 03:30:01,220 --> 03:30:04,581 It's going to check again the\ncondition, is i less than 3? 3675 03:30:04,581 --> 03:30:06,351 Then it's going to meow if so. 3676 03:30:06,351 --> 03:30:10,761 Then it might go ahead and increment\n 3677 03:30:10,761 --> 03:30:14,449 So, again, this does not read quite\n 3678 03:30:14,449 --> 03:30:16,740 You kind of read it left to\nright and then jump around. 
3679 03:30:16,740 --> 03:30:21,890 But, again, the initialization,\nthe constant Boolean expression 3680 03:30:21,890 --> 03:30:24,681 being checked, and the\nupdate after each time 3681 03:30:24,681 --> 03:30:32,551 does the exact same thing as what we saw\n 3682 03:30:35,011 --> 03:30:37,052 I think most people would\nprobably eventually use 3683 03:30:37,052 --> 03:30:42,570 a for loop once comfortable, but just\n 3684 03:30:42,570 --> 03:30:45,671 All right, any questions, then, on\n 3685 03:30:47,923 --> 03:30:49,631 DAVID J. MALAN: A for\nloop and while loop 3686 03:30:49,630 --> 03:30:53,140 can both be used to do\nexactly the same thing. 3687 03:30:53,140 --> 03:30:56,733 There are subtle differences\nwith issues of scope 3688 03:30:56,733 --> 03:30:58,691 which we'll discuss before\nlong, where when you 3689 03:30:58,691 --> 03:31:00,921 create a variable in a for loop-- 3690 03:31:00,921 --> 03:31:04,390 notice that it was, again, inside\nof those parentheses, which 3691 03:31:04,390 --> 03:31:08,831 technically means it's only going to\n 3692 03:31:08,831 --> 03:31:12,581 By contrast, with the while loop,\nI declared my variable outside 3693 03:31:13,210 --> 03:31:17,030 That variable is going to continue\n 3694 03:31:17,031 --> 03:31:20,008 So that's one of the\nminor differences there. 3695 03:31:20,591 --> 03:31:22,871 But you'll see some others over time. 3696 03:31:22,870 --> 03:31:26,090 All right, so we claim then\nthat it's better in some form 3697 03:31:27,290 --> 03:31:29,331 So let's actually jump back to the code. 3698 03:31:29,331 --> 03:31:33,831 Let me go ahead and now re-implement\n 3699 03:31:33,831 --> 03:31:39,701 So how about for int i\n= 0, i less than 3, i++. 3700 03:31:39,700 --> 03:31:44,440 Then inside my curly braces, let me go\n 3701 03:31:44,441 --> 03:31:46,911 with a newline and a semicolon. 
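The scope difference mentioned above can be sketched roughly as follows. Both function names are hypothetical and exist only to make the point observable: a variable declared before a while loop is still usable after the loop, while a variable declared inside a for loop's parentheses is not.

```c
#include <stdio.h>

// i is declared outside the loop, so it still exists afterward.
int value_after_while(void)
{
    int i = 0;
    while (i < 3)
    {
        i++;
    }
    return i;   // i is still in scope here and holds 3
}

// i is declared inside the for loop's parentheses,
// so it exists only within the loop.
int sum_inside_for(void)
{
    int total = 0;
    for (int i = 0; i < 3; i++)
    {
        total += i;
    }
    // Writing "return i;" here would not compile: i is out of scope.
    return total;   // 0 + 1 + 2
}
```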
3702 03:31:46,911 --> 03:31:50,661 So I did it pretty quickly just because\n 3703 03:31:50,661 --> 03:31:53,591 But if I now make meow, no errors there. 3704 03:31:57,101 --> 03:31:59,230 Well, let's do now what\nwe did last week, which 3705 03:31:59,230 --> 03:32:03,501 was to begin to make our own\ncustom functions, if you will 3706 03:32:03,501 --> 03:32:09,431 by using our own in C. So here's\n 3707 03:32:09,431 --> 03:32:13,841 but we'll explain over time what\n 3708 03:32:13,841 --> 03:32:17,201 If I want to create a\nfunction called meow-- 3709 03:32:17,200 --> 03:32:21,280 because the authors of C did not create\n 3710 03:32:21,281 --> 03:32:23,951 I need to give it a name, like meow. 3711 03:32:23,950 --> 03:32:26,530 I need to specify if\nit takes any inputs. 3712 03:32:26,531 --> 03:32:28,391 For now, I'm going to say no. 3713 03:32:28,390 --> 03:32:34,200 And I'm going to explicitly say no\n 3714 03:32:34,200 --> 03:32:37,080 It's also necessary when\nimplementing a function in C-- 3715 03:32:37,081 --> 03:32:38,851 which was not necessary in Scratch-- 3716 03:32:38,851 --> 03:32:41,411 to specify what its return type is. 3717 03:32:41,411 --> 03:32:45,060 But for now, I'm just going to say\n 3718 03:32:46,591 --> 03:32:49,531 and that's what the void\nin parentheses means-- 3719 03:32:49,531 --> 03:32:54,091 and it does not return\nanything like ask did 3720 03:32:54,091 --> 03:32:56,220 or like get_string or get_int does. 3721 03:32:56,220 --> 03:32:59,940 meow's purpose in life is just to\n 3722 03:32:59,941 --> 03:33:01,990 by printing something on the screen. 3723 03:33:01,990 --> 03:33:04,261 So what is meow going to do? 3724 03:33:04,261 --> 03:33:06,480 I'm going to have it\nquite simply say printf 3725 03:33:06,480 --> 03:33:10,290 quote unquote, "meow", backslash n. 3726 03:33:10,290 --> 03:33:14,670 And now, just like in\nScratch, I can now just call 3727 03:33:14,671 --> 03:33:16,831 a brand new function called meow. 
3728 03:33:16,831 --> 03:33:19,531 And here's where too, if you\nreally don't like the curly braces 3729 03:33:19,531 --> 03:33:22,591 technically speaking, you can\nget rid of them when there's 3730 03:33:22,591 --> 03:33:24,810 only one line of code inside your loop. 3731 03:33:24,810 --> 03:33:27,511 But, again, stylistically,\nI would encourage 3732 03:33:27,511 --> 03:33:30,511 you to preserve them to make\nsuper clear to yourself and others 3733 03:33:32,470 --> 03:33:35,100 Let me go ahead and save\nthis and do make meow. 3734 03:33:39,970 --> 03:33:42,140 DAVID J. MALAN: Yeah, so\n0 does not belong there. 3735 03:33:49,601 --> 03:33:51,621 All right, it's still working OK. 3736 03:33:51,620 --> 03:33:54,770 But recall what I did in Scratch,\n 3737 03:33:54,771 --> 03:33:57,820 And just to make a point, let me\njust highlight this and move it 3738 03:33:59,261 --> 03:34:02,291 Because, again, now that meow\nexists, it's an abstraction. 3739 03:34:02,290 --> 03:34:04,600 I just know a meow function exists. 3740 03:34:04,601 --> 03:34:06,111 I want to be able to use it. 3741 03:34:08,021 --> 03:34:09,730 My main function is the same. 3742 03:34:09,730 --> 03:34:12,280 Let me go ahead and make meow again. 3743 03:34:12,281 --> 03:34:17,531 And now, just by moving that function,\n 3744 03:34:17,531 --> 03:34:18,851 And let's look at the first. 3745 03:34:18,851 --> 03:34:21,018 Again, the rule of thumb\nhere-- it's a little small 3746 03:34:21,018 --> 03:34:25,060 but it says meow.c in bold-- which is\n 3747 03:34:25,060 --> 03:34:27,911 5 is the line number,\nand 20 is the character. 3748 03:34:27,911 --> 03:34:30,740 So line number is enough alone. 3749 03:34:32,501 --> 03:34:36,611 Oh, this is what happens\nwhen I scrolled up too far. 3750 03:34:37,210 --> 03:34:39,760 This is the error we're\nnow looking at, line 7. 
3751 03:34:39,761 --> 03:34:43,691 I was looking at the old error message\n 3752 03:34:45,550 --> 03:34:49,780 All right, apparently, C does not\n 3753 03:34:49,781 --> 03:34:53,441 Implicit declaration of\nfunction meow is invalid in C99. 3754 03:34:54,581 --> 03:34:57,941 Declaration of function means\nyour creation of a function. 3755 03:34:57,941 --> 03:35:01,511 Like, I'm declaring that meow\nexists, but I haven't apparently 3756 03:35:02,501 --> 03:35:06,230 And then C99 is the version\nof C from the year 1999 3757 03:35:06,230 --> 03:35:09,190 which we generally use here, it's\n 3758 03:35:12,460 --> 03:35:16,030 Can you infer from the mere fact\n 3759 03:35:16,031 --> 03:35:19,041 of the file-- which was fine\nin Scratch but now is bad-- 3760 03:35:21,865 --> 03:35:23,990 DAVID J. MALAN: Yeah, C is\njust kind of old school. 3761 03:35:23,990 --> 03:35:25,681 It reads your code top to bottom. 3762 03:35:25,681 --> 03:35:30,001 And if it does not know what meow\n 3763 03:35:30,001 --> 03:35:32,911 it just freaks out and prints\nout these error messages. 3764 03:35:32,911 --> 03:35:38,511 So the solution is, quite simply, don't\n 3765 03:35:38,511 --> 03:35:42,470 But you can imagine this getting a\n 3766 03:35:42,470 --> 03:35:46,761 because main is, by name, the\nmain part of your program. 3767 03:35:46,761 --> 03:35:49,970 And, honestly, it would just\nbe nice if main were always 3768 03:35:51,099 --> 03:35:53,390 Because if you want to\nunderstand what a file is doing 3769 03:35:53,390 --> 03:35:55,400 it makes sense to just\nread it top to bottom. 3770 03:35:55,400 --> 03:35:57,575 Well, there is a solution to this. 
3771 03:35:57,575 --> 03:36:02,720 You can put functions in different\n 3772 03:36:02,720 --> 03:36:07,501 as you-- and this is perhaps the\n 3773 03:36:07,501 --> 03:36:10,671 so long as you leave a little\nbreadcrumb for the compiler 3774 03:36:10,671 --> 03:36:13,130 at the very top of your\nfile that literally 3775 03:36:13,130 --> 03:36:17,090 repeats the return value,\nthe name, and the arguments 3776 03:36:17,091 --> 03:36:19,681 to that function, semicolon. 3777 03:36:19,681 --> 03:36:22,791 This is, so to speak,\ndeclaring your function-- 3778 03:36:22,790 --> 03:36:25,260 and the real fancy way\nis this is a prototype. 3779 03:36:25,261 --> 03:36:27,541 It's like, what is this\nthing going to look like? 3780 03:36:27,540 --> 03:36:30,480 But the semicolon means I'm not\ngoing to deal with this yet. 3781 03:36:30,480 --> 03:36:32,331 I'm going to actually\ndefine the function 3782 03:36:32,331 --> 03:36:34,701 or implement it down below here. 3783 03:36:34,700 --> 03:36:36,680 This is kind of a stupid detail. 3784 03:36:36,681 --> 03:36:40,221 More recent languages\nget rid of this need 3785 03:36:40,220 --> 03:36:41,993 you can put your functions in any order. 3786 03:36:41,994 --> 03:36:43,911 But, again, if you just\nthink about the basics 3787 03:36:43,911 --> 03:36:46,230 of programming languages\nlike this one here-- 3788 03:36:47,210 --> 03:36:49,350 it must just be reading\nyour code top to bottom. 3789 03:36:49,351 --> 03:36:53,060 So annoying, yes, but\nexplained, yes too. 3790 03:36:53,060 --> 03:36:58,280 So let me go ahead and make meow one\n 3791 03:36:58,281 --> 03:37:02,331 And let me make one final enhancement\nto this meow program here. 3792 03:37:02,331 --> 03:37:05,220 Let me go ahead now and\nsay something like this. 3793 03:37:05,220 --> 03:37:07,190 Let me go ahead and say,\nall right, wouldn't it 3794 03:37:07,191 --> 03:37:12,941 be nice if my meow function could do\n 3795 03:37:12,941 --> 03:37:14,701 So suppose I want to do this. 
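A minimal sketch of the prototype idea just described, assuming the same meow function as in the lecture. The caller meow_repeatedly is a hypothetical stand-in for main, used here so the fragment can sit next to an existing main; it appears above meow's definition and still compiles because the prototype comes first.

```c
#include <stdio.h>

// Prototype: return type, name, and arguments, followed by a semicolon.
void meow(void);

// Hypothetical caller standing in for main. It uses meow before the
// compiler has seen meow's body, which is fine thanks to the prototype.
int meow_repeatedly(int times)
{
    int calls = 0;
    for (int i = 0; i < times; i++)
    {
        meow();
        calls++;
    }
    return calls;
}

// Definition: the actual implementation, further down in the file.
void meow(void)
{
    printf("meow\n");
}
```

Deleting the prototype line would bring back an implicit-declaration error like the one shown earlier, since the compiler reads the file top to bottom.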
3796 03:37:14,700 --> 03:37:17,870 This meow function at the moment\nis going to meow three times. 3797 03:37:17,870 --> 03:37:21,110 But suppose I want to meow\nn times, where n is just 3798 03:37:21,111 --> 03:37:23,181 some number provided by the user. 3799 03:37:23,181 --> 03:37:27,560 Well, just like in Scratch,\ncustom functions can take inputs 3800 03:37:27,560 --> 03:37:30,120 I just presently am saying void. 3801 03:37:30,120 --> 03:37:34,550 But if I change this to int n,\nthereby telling the compiler 3802 03:37:34,550 --> 03:37:38,150 hey, meow still doesn't\nreturn something 3803 03:37:38,150 --> 03:37:40,581 but it does take something as input. 3804 03:37:40,581 --> 03:37:43,581 It takes an integer,\nand I want to call it n. 3805 03:37:43,581 --> 03:37:46,070 So this is another way\nof declaring a variable 3806 03:37:46,070 --> 03:37:48,890 but a way of declaring a\nvariable that gets handed into 3807 03:37:50,431 --> 03:37:55,101 So now if I tighten up main here, now\n 3808 03:37:55,101 --> 03:37:58,551 just like in Scratch, which is this. 3809 03:37:58,550 --> 03:38:01,040 If I now look at this\ncode-- let me zoom in here-- 3810 03:38:01,040 --> 03:38:04,310 now my main program is really\nwell-written in the sense 3811 03:38:04,310 --> 03:38:07,190 that it just says what it\ndoes, meow three times. 3812 03:38:07,191 --> 03:38:11,031 This works, though, because I\n 3813 03:38:11,031 --> 03:38:17,691 an integer called n, and then using\n 3814 03:38:18,771 --> 03:38:21,271 You might have caught my one mistake. 3815 03:38:21,271 --> 03:38:25,101 I also have to remind myself up\nhere to make that change too. 3816 03:38:25,101 --> 03:38:27,891 Again, this is one of the only\nredundancies or copy-paste 3817 03:38:29,511 --> 03:38:32,031 But there, I have now a better version. 3818 03:38:32,031 --> 03:38:36,381 So let me go ahead and rerun\nthis, make meow, ./meow.
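The parameterized version described above might look roughly like this. It is a sketch rather than the lecture's exact file: main is left out, and in a full program it would sit between the prototype and the definition.

```c
#include <stdio.h>

// Prototype, updated to match the definition: meow now takes an int.
void meow(int n);

// meow returns nothing (void) but takes one input, an integer called n.
void meow(int n)
{
    for (int i = 0; i < n; i++)
    {
        printf("meow\n");
    }
}
```

With this in place, the body of main can shrink to a single call such as meow(3);, which reads as "meow three times".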
3819 03:38:36,980 --> 03:38:39,530 So, again, no change\nin correctness but now 3820 03:38:39,531 --> 03:38:41,391 again, we're sort of\nmodularizing our code. 3821 03:38:41,390 --> 03:38:44,720 And, heck, what you could do now-- and\n 3822 03:38:45,501 --> 03:38:48,320 those header files we talked\nabout early, those libraries 3823 03:38:48,320 --> 03:38:51,050 this is the kind of modularization\nwe're talking about. 3824 03:38:51,050 --> 03:38:54,921 We, the staff, wrote a function called\n 3825 03:38:54,921 --> 03:39:01,191 we put it in a file called CS50, and we\n 3826 03:39:01,191 --> 03:39:03,381 these things called prototypes-- 3827 03:39:05,570 --> 03:39:10,251 So that when you all, as aspiring\nprogrammers, include cs50.h 3828 03:39:10,251 --> 03:39:14,300 you are sort of secretly telling the\n 3829 03:39:14,300 --> 03:39:16,290 what the menu of available functions is. 3830 03:39:16,790 --> 03:39:21,560 Because in CS50 is lines like\nthese-- obviously, not for meow 3831 03:39:21,560 --> 03:39:24,140 but for get_string,\nget_int, and so forth. 3832 03:39:24,140 --> 03:39:29,540 And stdio.h is the same lines\nof code for things like printf. 3833 03:39:29,540 --> 03:39:31,740 So that's all that's going on there. 3834 03:39:31,740 --> 03:39:38,040 It's just a way of telling the computer\n 3835 03:39:38,040 --> 03:39:40,920 All right, any questions,\nthen, on these here? 3836 03:39:44,310 --> 03:39:47,130 So if you don't mind, I\nwant to continue to wave 3837 03:39:47,130 --> 03:39:49,050 my hand at that detail for today. 3838 03:39:49,050 --> 03:39:53,581 Indeed, int main void is a little weird,\n 3839 03:39:53,581 --> 03:39:55,667 We have no mechanism\nfor providing input yet. 3840 03:39:55,667 --> 03:39:57,751 And what does it mean for\nmain to return anything? 3841 03:39:57,751 --> 03:39:59,251 Like, who is it returning to? 3842 03:39:59,251 --> 03:40:00,378 For another day, if we may. 
3843 03:40:00,378 --> 03:40:02,461 They're going to come into\nplay but that, for now 3844 03:40:02,460 --> 03:40:05,100 today is just something you\nshould take at face value 3845 03:40:05,101 --> 03:40:08,320 as necessary copy-paste\nto begin programs. 3846 03:40:08,320 --> 03:40:11,490 So meow is a function that takes an\n 3847 03:40:11,490 --> 03:40:14,970 but it didn't actually have a\nreturn value, hence the void. 3848 03:40:14,970 --> 03:40:17,520 But what if we actually want\nto create our own function that 3849 03:40:17,521 --> 03:40:20,431 not only takes 0 or\nmore inputs as arguments 3850 03:40:20,431 --> 03:40:24,060 but also returns some value, maybe an\n 3851 03:40:24,990 --> 03:40:27,640 Well, it turns out, in C,\nwe can do that as well. 3852 03:40:27,640 --> 03:40:31,081 Let me go ahead and create a\nnew file here called discount. 3853 03:40:31,081 --> 03:40:33,121 And let's implement a\nquick program via which 3854 03:40:33,120 --> 03:40:35,640 we can discount some regular\nprice by some percentage 3855 03:40:35,640 --> 03:40:37,740 as though there's a sale\ngoing on in a store. 3856 03:40:37,740 --> 03:40:44,490 Let me go ahead and include our usual\n 3857 03:40:44,490 --> 03:40:47,550 Let me give myself int\nmain void as before. 3858 03:40:47,550 --> 03:40:50,380 And inside of main, let's go\nahead and do something simple. 3859 03:40:50,380 --> 03:40:52,470 Let's give ourselves a\nfloat called regular 3860 03:40:52,470 --> 03:40:55,350 representing the regular\nprice of something in a store. 3861 03:40:55,351 --> 03:40:58,111 Let's go ahead and get a float\nfrom the user asking them 3862 03:41:00,480 --> 03:41:04,740 Then, next, let's go ahead and declare\n 3863 03:41:04,740 --> 03:41:07,950 called sale, ultimately\nrepresenting the sale price 3864 03:41:07,950 --> 03:41:09,810 after some percentage discount off. 3865 03:41:09,810 --> 03:41:13,020 And let's go ahead and simply\ncalculate whatever regular is. 
3866 03:41:13,021 --> 03:41:15,851 And, say, 15% off is a\npretty good discount. 3867 03:41:15,851 --> 03:41:20,471 So let's go ahead and discount\nregular, whatever it is, by 15% 3868 03:41:20,470 --> 03:41:23,911 which is equivalent, of course, to\n 3869 03:41:25,681 --> 03:41:30,126 Of course, if we're taking off 15%,\n 3870 03:41:30,126 --> 03:41:32,251 Now, let's go ahead and\nprint out the results here. 3871 03:41:32,251 --> 03:41:36,271 Let me go ahead and say\nprintf sale price, colon-- 3872 03:41:36,271 --> 03:41:38,941 let me go ahead and %f,\nbut, more specifically 3873 03:41:38,941 --> 03:41:43,951 %.2f because, at least in US currency\n 3874 03:41:45,751 --> 03:41:48,630 And then let me go ahead and\nplug in the value of sale. 3875 03:41:48,630 --> 03:41:52,230 All right, let's go down here\nand do make discount, Enter. 3876 03:41:52,230 --> 03:41:54,780 So far, so good-- ./discount. 3877 03:41:54,781 --> 03:41:57,001 And the regular price is maybe $100. 3878 03:41:57,001 --> 03:41:59,560 So the sale price should be $85. 3879 03:41:59,560 --> 03:42:01,483 So our arithmetic seems\nto be correct here. 3880 03:42:01,483 --> 03:42:02,941 But let's fast-forward now in time. 3881 03:42:02,941 --> 03:42:04,801 Suppose that we find\nourselves discounting 3882 03:42:04,800 --> 03:42:07,727 a lot of prices in an\napplication, maybe a website 3883 03:42:07,727 --> 03:42:10,560 like Amazon where they're offering\n 3884 03:42:10,560 --> 03:42:13,200 And it'd be nice to have\na reusable function that 3885 03:42:13,200 --> 03:42:16,420 just does this arithmetic for\nus, simple though it may be. 
3886 03:42:16,421 --> 03:42:18,511 So let's go ahead and\nmodify discount this time 3887 03:42:18,511 --> 03:42:22,051 to give ourselves our own\nfunction called discount 3888 03:42:22,050 --> 03:42:23,778 for instance, that takes an input-- 3889 03:42:23,778 --> 03:42:25,861 like the regular price\nthat you want to discount-- 3890 03:42:25,861 --> 03:42:28,073 and then it also returns a value. 3891 03:42:28,073 --> 03:42:29,281 It doesn't just print it out. 3892 03:42:29,281 --> 03:42:34,081 It returns a value, namely, a float\n 3893 03:42:34,081 --> 03:42:37,230 So let me go down\nbelow main and go ahead 3894 03:42:37,230 --> 03:42:39,900 and define a function that's\ngoing to return a float 3895 03:42:39,900 --> 03:42:42,118 because we're dealing\nwith dollar amount still. 3896 03:42:42,118 --> 03:42:43,951 The function is going\nto be called discount. 3897 03:42:43,950 --> 03:42:47,970 And it's going to take one input, like\n 3898 03:42:47,970 --> 03:42:50,171 In here, I'm going to do\nsomething very simple. 3899 03:42:50,171 --> 03:42:55,810 I'm going to say float sale equals\n 3900 03:42:55,810 --> 03:42:58,052 And then I'm going to go\nahead and return sale. 3901 03:42:58,052 --> 03:43:00,511 Now, for that matter, I can\nactually tighten this up a bit. 3902 03:43:00,511 --> 03:43:04,081 If I'm only declaring a variable\nto store a value that I'm then 3903 03:43:04,081 --> 03:43:09,191 returning with this keyword return, I\n 3904 03:43:09,191 --> 03:43:11,041 So I can delete the second line. 3905 03:43:11,040 --> 03:43:13,680 And I can actually just go ahead\nand get rid of that variable 3906 03:43:13,681 --> 03:43:16,291 altogether and immediately\nreturn whatever the arithmetic 3907 03:43:16,290 --> 03:43:20,161 result is of taking the price input,\n 3908 03:43:21,900 --> 03:43:25,681 So very simple function that\nsimply does the discounting for me. 
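The two forms of the function just described can be sketched as follows, assuming the fixed 15% discount from the example. The name discount_verbose is made up here so that both versions can sit in one file.

```c
#include <stdio.h>

// First version: store the result in a variable, then return it.
float discount_verbose(float price)
{
    float sale = price * .85;
    return sale;
}

// Tightened version: return the result of the arithmetic directly.
float discount(float price)
{
    return price * .85;
}
```

Neither version prints anything; each hands its result back to whoever called it.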
3909 03:43:25,681 --> 03:43:29,011 As always, let me go\nahead and copy-paste-- 3910 03:43:29,011 --> 03:43:32,251 the only time it's OK to copy-paste--\n 3911 03:43:32,251 --> 03:43:35,341 the top of the file, so that\nwhen compiling this code 3912 03:43:35,341 --> 03:43:38,611 main has already seen\nthe word discount before. 3913 03:43:38,611 --> 03:43:40,271 And now let me go into the code here. 3914 03:43:40,271 --> 03:43:43,411 And instead of doing\nthe math myself in main 3915 03:43:43,411 --> 03:43:46,470 let me presume that we\nhave some function already 3916 03:43:46,470 --> 03:43:50,790 in our toolkit called discount that\n 3917 03:43:52,845 --> 03:43:54,970 And then down here, my code\ndoesn't need to change. 3918 03:43:54,970 --> 03:43:58,200 I'm still going to print out\nsale, the variable in which I'm 3919 03:43:58,200 --> 03:44:00,440 storing that result. But\nnotice what I've done here. 3920 03:44:00,441 --> 03:44:02,191 I've sort of abstracted\naway the notion 3921 03:44:02,191 --> 03:44:06,091 of taking a discount by creating my\n 3922 03:44:06,091 --> 03:44:07,951 price, or anything else as input. 3923 03:44:07,950 --> 03:44:10,440 It does a little bit of math,\nsimple though it is here 3924 03:44:10,441 --> 03:44:12,011 and then it returns a value. 3925 03:44:12,011 --> 03:44:14,941 But notice that discount\nis not printing that value. 3926 03:44:14,941 --> 03:44:17,191 It's literally using\nthis other keyword called 3927 03:44:17,191 --> 03:44:21,811 return so that I can hand back that\n 3928 03:44:21,810 --> 03:44:25,411 back a value, just like get_int\n 3929 03:44:25,411 --> 03:44:29,851 for you-- so that I up here on\nline 9 can go ahead and store 3930 03:44:29,851 --> 03:44:33,721 that value in a variable if I want\n 3931 03:44:33,720 --> 03:44:38,440 Let me go ahead now and recompile\nthis code with make discount. 3932 03:44:38,441 --> 03:44:40,381 Let me go ahead and do ./discount.
3933 03:44:43,040 --> 03:44:46,530 Sale price is going to be $85 as well. 3934 03:44:46,531 --> 03:44:50,721 Now, it turns out that functions don't\n 3935 03:44:51,261 --> 03:44:53,240 They can actually take 2 or 3 or more. 3936 03:44:53,240 --> 03:44:57,200 So, in fact, suppose we wanted to now\n 3937 03:44:57,200 --> 03:45:01,400 and take in as input to the discount\n 3938 03:45:01,400 --> 03:45:03,890 that I want to discount but\nalso the percentage off 3939 03:45:03,890 --> 03:45:08,150 thereby allowing us to support not just\n 3940 03:45:08,870 --> 03:45:13,610 Well, let me go up here and declare\n 3941 03:45:13,611 --> 03:45:15,951 And let me ask the user\nfor how many percentage 3942 03:45:15,950 --> 03:45:17,810 points they want to take off. 3943 03:45:17,810 --> 03:45:21,501 So I'm going to say percent_off\ninside of the prompt here 3944 03:45:21,501 --> 03:45:23,751 get that int called percent_off. 3945 03:45:23,751 --> 03:45:26,871 And now in addition to\npassing in regular as an input 3946 03:45:26,870 --> 03:45:30,800 to the discount function, I'm\nalso going to pass in percent_off. 3947 03:45:30,800 --> 03:45:34,820 But I need to tell the computer\n 3948 03:45:34,820 --> 03:45:37,130 and the way I do this\nis just with a comma 3949 03:45:37,130 --> 03:45:39,380 down here in the\nfunction's own definition. 3950 03:45:39,380 --> 03:45:43,610 Here is going to be a percentage\nargument, a second argument 3951 03:45:44,511 --> 03:45:50,091 And I'm now going to use that\n 3952 03:45:50,091 --> 03:45:53,480 I don't want to just do percentage\n 3953 03:45:53,480 --> 03:45:56,744 that's going to increase\nthe size of the total price. 
3954 03:45:56,744 --> 03:45:59,661 I actually need to do a little bit\n 3955 03:45:59,661 --> 03:46:03,380 a percentage off, like the number\n15 for 15 percentage points 3956 03:46:03,380 --> 03:46:06,590 I need to do 100 minus that\nmany percentage points 3957 03:46:06,591 --> 03:46:08,871 thereby giving me 100 minus 15-- 3958 03:46:09,710 --> 03:46:13,130 And then I need to divide\nthat by 100 in order now 3959 03:46:13,130 --> 03:46:18,030 to give myself 0.85 times\nthe price that was passed in. 3960 03:46:18,031 --> 03:46:22,911 But if I go ahead now and save this,\n 3961 03:46:22,911 --> 03:46:24,890 I notice that I've\nactually got an error here. 3962 03:46:26,300 --> 03:46:28,220 Well, I need to change\nthat prototype too. 3963 03:46:28,220 --> 03:46:30,718 And, again, this is admittedly\nan annoying aspect of C 3964 03:46:30,718 --> 03:46:32,510 that you have to maintain\nconsistency here. 3965 03:46:33,177 --> 03:46:35,390 I'm just going to go up\nhere, change this to int 3966 03:46:35,390 --> 03:46:37,640 percentage-- spelling incorrectly. 3967 03:46:37,640 --> 03:46:40,581 And now let me retry\ncompilation, make discount 3968 03:46:40,581 --> 03:46:42,111 crossing my fingers this time. 3969 03:46:42,111 --> 03:46:46,701 Worked OK. ./discount, and voila, $100. 3970 03:46:46,700 --> 03:46:49,220 And percent off, say, 15 points. 3971 03:46:52,980 --> 03:46:55,310 Now, it's worth noting\nthat I've deliberately 3972 03:46:55,310 --> 03:46:58,400 returned the results of my\nmath from this function. 3973 03:46:58,400 --> 03:47:01,790 I haven't just done the math on the\n 3974 03:47:01,790 --> 03:47:03,950 In fact, if we take a look\nat this second version 3975 03:47:03,950 --> 03:47:07,770 where discount is now taking a price\n 3976 03:47:07,771 --> 03:47:10,041 notice that I'm not doing\nsomething like this. 
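The two-argument version worked out above can be sketched like this. The parameter names price and percentage follow the lecture; everything else is kept minimal.

```c
#include <stdio.h>

// Prototype: must be kept consistent with the definition below.
float discount(float price, int percentage);

// Takes the regular price and a number of percentage points off,
// and returns the discounted price.
float discount(float price, int percentage)
{
    // For 15 points off: 100 - 15 is 85, and dividing by 100
    // makes the overall factor 0.85.
    return price * (100 - percentage) / 100;
}
```

Because price is a float, the multiplication happens in floating point before the division by 100, so a call like discount(100, 15) evaluates 100 * 85 / 100.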
3977 03:47:10,040 --> 03:47:14,420 I'm not just saying price\nequals price times 100 3978 03:47:14,421 --> 03:47:18,261 minus percentage divided\nby 100 and leaving at that. 3979 03:47:18,261 --> 03:47:21,951 The problem there is that\nthis variable price is going 3980 03:47:21,950 --> 03:47:24,290 to be scoped to that discount function. 3981 03:47:24,290 --> 03:47:27,230 And we'll encounter this again\n 3982 03:47:27,230 --> 03:47:32,091 just refers to where in which a\n 3983 03:47:33,210 --> 03:47:36,200 So it turns out if I change price\n 3984 03:47:36,200 --> 03:47:38,600 function, that's not going\nto have a lasting effect. 3985 03:47:38,601 --> 03:47:40,310 If I actually want to\nget the result back 3986 03:47:40,310 --> 03:47:43,761 to the function that used the\ndiscount function, namely, main 3987 03:47:43,761 --> 03:47:47,001 I actually do need to take this\napproach of actually returning 3988 03:47:47,001 --> 03:47:51,720 the value explicitly so that ultimately\n 3989 03:47:52,220 --> 03:47:54,345 Well, let's go ahead and\nmaybe how about let's just 3990 03:47:54,345 --> 03:47:57,800 use these primitives in\njust a few different ways. 3991 03:47:57,800 --> 03:48:02,300 How about a little game of\nyesteryear, Super Mario Brothers? 3992 03:48:02,300 --> 03:48:05,661 And in the original Super Mario\n 3993 03:48:05,661 --> 03:48:08,060 so you have these\nside-scrolling worlds that 3994 03:48:08,060 --> 03:48:11,470 look like this where there's some coins\n 3995 03:48:11,970 --> 03:48:15,230 So let's just use this as a\nvisual to consider how in C could 3996 03:48:15,230 --> 03:48:17,060 I start to make\nsomething semi-graphical. 3997 03:48:17,060 --> 03:48:20,270 Like, not actual colors or fanciness,\n 3998 03:48:20,271 --> 03:48:23,011 just something like printing\nout some question marks. 
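The scoping point made above, just before the Mario example, can be sketched as follows. The function discount_in_place is hypothetical and shows the approach that does not work: it changes only its own copy of price, so the caller's variable is left untouched.

```c
#include <stdio.h>

// Hypothetical "wrong" approach: price here is a copy that belongs to
// this function, so the assignment has no effect on the caller.
void discount_in_place(float price, int percentage)
{
    price = price * (100 - percentage) / 100;
    printf("inside discount_in_place: %.2f\n", price);
}

// The approach from the lecture: return the result explicitly.
float discount(float price, int percentage)
{
    return price * (100 - percentage) / 100;
}
```

If a caller stores 100 in a variable called regular and calls discount_in_place(regular, 15), regular is still 100 afterward. Only the version that returns a value lets the caller capture the discounted price, for example with float sale = discount(regular, 15);.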
3999 03:48:23,011 --> 03:48:26,031 Well, if I go back over here,\nlet me create that actual file 4000 03:48:30,290 --> 03:48:34,190 Let me go ahead and include\nstdio.h, int main void, again 4001 03:48:34,191 --> 03:48:36,320 which we'll continue to\ncopy-paste for today. 4002 03:48:36,320 --> 03:48:40,431 And then let me just go ahead and\n 4003 03:48:41,421 --> 03:48:44,609 All right, this is what we\nmight call ASCII art, which 4004 03:48:44,609 --> 03:48:47,400 just means graphics but really just\n 4005 03:48:47,400 --> 03:48:52,130 And if I make mario and do ./mario,\n 4006 03:48:52,130 --> 03:48:55,911 as this, but it's the beginning\nof this kind of map for a game. 4007 03:48:55,911 --> 03:49:00,050 Well, if I wanted to now print\nout those things dynamically 4008 03:49:00,050 --> 03:49:01,730 let me go back to my code here. 4009 03:49:01,730 --> 03:49:03,831 And instead of printing\nout four all at once 4010 03:49:03,831 --> 03:49:08,451 I could do something like for int i\n 4011 03:49:08,450 --> 03:49:13,230 And then inside here, I could just\n 4012 03:49:13,230 --> 03:49:15,890 Let me save that, make mario. 4013 03:49:15,890 --> 03:49:20,630 And, at the risk of\ndisappointing, so close 4014 03:49:20,630 --> 03:49:23,800 but I made a mistake,\njust a stupid aesthetic. 4015 03:49:23,800 --> 03:49:25,970 The prompt is not on the new line. 4016 03:49:29,060 --> 03:49:31,810 DAVID J. MALAN: Yeah, I need an\n 4017 03:49:35,540 --> 03:49:37,790 OK, no, because that's going\nto put it after every one 4018 03:49:37,790 --> 03:49:40,498 and it's going to make this thing\n 4019 03:49:40,498 --> 03:49:44,030 So, logically, just like in Scratch, put\n 4020 03:49:44,751 --> 03:49:48,300 And just print out, for instance,\nonly, quote unquote, new line. 4021 03:49:48,300 --> 03:49:51,350 And now if I do make\nmario again, ./mario, OK.
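A sketch of the fix just described: print each question mark without a newline inside the loop, then print a single newline after the loop. Wrapping this in a function called print_row with a width parameter is an illustrative choice; the lecture writes the loop directly in main with the number 4.

```c
#include <stdio.h>

// Prints a row of question marks, then moves the cursor to the next line.
void print_row(int width)
{
    for (int i = 0; i < width; i++)
    {
        // No newline here, or each mark would land on its own line.
        printf("?");
    }
    // One newline after the loop, so the prompt starts on a fresh line.
    printf("\n");
}
```

Calling print_row(4) prints ???? followed by a newline.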
4022 03:49:52,337 --> 03:49:54,171 But a little better\ndesigned in that now I'm 4023 03:49:54,171 --> 03:49:57,390 not repeating myself multiple times,\n 4024 03:49:57,390 --> 03:50:00,710 But let's do one other\nthing here with mario. 4025 03:50:00,710 --> 03:50:05,960 Let me go ahead and ask the user how\n 4026 03:50:05,960 --> 03:50:09,710 The catch here is that there's another\n 4027 03:50:09,710 --> 03:50:12,170 and it's called a do\nwhile loop, generally. 4028 03:50:12,171 --> 03:50:15,711 A do while loop is\nsimilar to a while loop 4029 03:50:15,710 --> 03:50:19,220 but it checks the condition\nlast instead of first. 4030 03:50:19,220 --> 03:50:21,038 Recall earlier on the\nslide, we had while 4031 03:50:21,039 --> 03:50:22,581 open parenthesis, closed parenthesis. 4032 03:50:22,581 --> 03:50:25,941 And I kept claiming that we check\n 4033 03:50:25,941 --> 03:50:29,070 it was, 3 in advance again and again. 4034 03:50:29,070 --> 03:50:32,630 A do while loop just inverts the\nlogic so that you can actually 4035 03:50:34,040 --> 03:50:36,380 At the top of this program,\nI'm going to go ahead now 4036 03:50:36,380 --> 03:50:40,460 and give myself a variable\nn like this of type integer. 4037 03:50:40,460 --> 03:50:44,870 And then I'm going to do, literally,\n 4038 03:50:44,870 --> 03:50:48,590 n equals get_int-- and I'm going\nto ask the user for the width 4039 03:50:48,591 --> 03:50:51,441 like the number of\ndollar signs to print. 4040 03:50:51,441 --> 03:50:56,031 And I'm going to do this\nwhile n is less than, say, 1. 4041 03:50:56,031 --> 03:50:59,031 So this is a little cryptic,\nbut the salient differences 4042 03:50:59,031 --> 03:51:04,191 are the Boolean expression is now\n 4043 03:51:07,380 --> 03:51:11,390 Well, the difference\nhere if I make mario is-- 4044 03:51:12,890 --> 03:51:16,620 I need to add cs50.h, because\nI'm now using get_int. 
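The do while pattern described above can be sketched without the CS50 library as follows. Note the assumptions: fake_get_int is a hypothetical stand-in for get_int that replays a fixed list of answers (0, then -100, then 4) instead of reading the keyboard, and get_width is an illustrative wrapper around the loop.

```c
#include <stdio.h>

// Canned answers that a user might type: two uncooperative, one valid.
static const int answers[] = {0, -100, 4};
static const int answer_count = 3;
static int next_answer = 0;

// Hypothetical stand-in for get_int: returns the next canned answer.
int fake_get_int(const char *prompt)
{
    int value = answers[next_answer];
    // Stay on the last (valid) answer once the list is used up.
    if (next_answer < answer_count - 1)
    {
        next_answer++;
    }
    printf("%s%i\n", prompt, value);
    return value;
}

// Ask at least once, then keep asking while the answer is less than 1.
int get_width(void)
{
    int n;
    do
    {
        n = fake_get_int("Width: ");
    }
    while (n < 1);
    return n;
}
```

The body runs before the condition is ever checked, which is the point of do while: there is nothing to test until the user has answered at least once. With the canned answers above, the loop rejects 0 and -100 and returns 4.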
4045 03:51:16,620 --> 03:51:22,050 If I now compile this version\nof Mario and do ./mario 4046 03:51:22,050 --> 03:51:26,640 a do while loop is helpful when you want\n 4047 03:51:26,640 --> 03:51:30,931 and then check some condition or some\n 4048 03:51:30,931 --> 03:51:32,461 in this case, the user cooperated. 4049 03:51:32,460 --> 03:51:35,772 It would make no sense if\nthe user typed in, say, 0 4050 03:51:35,772 --> 03:51:37,230 because there's no work to be done. 4051 03:51:37,230 --> 03:51:39,450 It'd be really weird if\nthey said negative 100 4052 03:51:39,450 --> 03:51:41,220 because that makes no sense logically. 4053 03:51:41,220 --> 03:51:46,261 So with this simple construct\nhere, I am doing the following 4054 03:51:48,810 --> 03:51:53,077 The implication is that as soon\n 4055 03:51:53,077 --> 03:51:55,411 I'm going to break out of\nthis loop, and I've got myself 4056 03:51:55,411 --> 03:52:00,630 a variable called n containing,\nessentially, a positive value, 1 4057 03:52:03,210 --> 03:52:07,320 And I can now use this, for\ninstance, here, change the 4 to an n 4058 03:52:07,320 --> 03:52:09,810 so now my program is completely dynamic. 4059 03:52:09,810 --> 03:52:13,710 Let me go ahead and do\nmake mario, ./mario again. 4060 03:52:18,751 --> 03:52:22,140 And the difference here with the\n 4061 03:52:22,140 --> 03:52:25,320 involves getting user input,\nwell, there's no question to ask. 4062 03:52:25,320 --> 03:52:27,161 The user hasn't given you anything yet. 4063 03:52:27,161 --> 03:52:31,111 So you have to do something first,\n 4064 03:52:31,111 --> 03:52:35,808 if the human has, for instance,\ncooperated, in this case. 4065 03:52:35,808 --> 03:52:37,891 All right, well why don't\nwe escalate to something 4066 03:52:37,890 --> 03:52:41,940 more like this in the same game,\n 4067 03:52:41,941 --> 03:52:45,481 and this is like a two-dimensional\nwall that's popping up here? 
4068 03:52:45,480 --> 03:52:48,480 It looks like a 3 by 3, for\n 4069 03:52:48,480 --> 03:52:51,931 And it's like, made of bricks, so\n 4070 03:52:51,931 --> 03:52:53,761 Well, it turns out that we can nest-- 4071 03:52:53,761 --> 03:52:57,310 that is, combine-- some of\nthese same ideas as follows. 4072 03:52:57,310 --> 03:53:01,110 Let me go ahead now and\nchange back to this code. 4073 03:53:01,111 --> 03:53:05,501 And I'm going to keep the\ndo while loop from before. 4074 03:53:05,501 --> 03:53:07,441 And I'm going to ask,\nthough, this question 4075 03:53:07,441 --> 03:53:09,091 what's the size of this square? 4076 03:53:09,091 --> 03:53:13,841 I'm going to assume it's n by\nn, so 3 by 3, 4 by 4, whatever. 4077 03:53:13,841 --> 03:53:16,621 So I'm just going to ask for the\nsize of this square of bricks. 4078 03:53:18,373 --> 03:53:20,790 Well, I'm going to go ahead,\nfor instance, and print out-- 4079 03:53:20,790 --> 03:53:25,907 how about for int i =\n0, i less than n, i++. 4080 03:53:25,907 --> 03:53:27,990 Let me just keep it simple\nand print out something 4081 03:53:27,990 --> 03:53:32,790 like this, just a single\nhash symbol that is a brick 4082 03:53:34,800 --> 03:53:36,270 All right, let's make mario. 4083 03:53:38,400 --> 03:53:40,320 OK, that's close to being it. 4084 03:53:41,431 --> 03:53:43,181 All right, but I need it to be wider. 4085 03:53:43,181 --> 03:53:45,751 So the solution last time\nwas to get rid of the newline 4086 03:53:45,751 --> 03:53:49,921 and then maybe put the\nnewline here, after the loop. 4087 03:53:49,921 --> 03:53:55,951 All right, so let's do make mario,\n 4088 03:53:55,950 --> 03:54:00,480 All right, so I kind of need to\ncombine these two ideas somehow. 4089 03:54:00,480 --> 03:54:04,351 So how might we solve this problem? 4090 03:54:04,351 --> 03:54:10,591 I want to print rows and\ncolumns, not row or column. 4091 03:54:12,630 --> 03:54:15,015 AUDIENCE: Add another\nloop in the for loop. 
4092 03:54:15,890 --> 03:54:17,570 Add another loop in the for loop, right? 4093 03:54:17,570 --> 03:54:21,890 If you use one loop conceptually\n 4094 03:54:21,890 --> 03:54:24,470 to bottom, and then\nwithin each row, you then 4095 03:54:24,470 --> 03:54:26,870 sort of typewriter style--\nold school typewriter-- 4096 03:54:26,870 --> 03:54:30,020 do like, character, character,\n 4097 03:54:30,021 --> 03:54:32,521 I think we could do exactly\nwhat we want to achieve here. 4098 03:54:33,421 --> 03:54:36,441 Let me get rid of this line and\nget rid of this line for now. 4099 03:54:36,441 --> 03:54:39,201 And let me just give myself\nanother loop on the inside. 4100 03:54:39,200 --> 03:54:42,920 And since I'm already using i,\nanother reasonable convention 4101 03:54:42,921 --> 03:54:45,021 here would be to say something like j. 4102 03:54:45,021 --> 03:54:49,311 So j also gets 0, j is less than n, j++. 4103 03:54:49,310 --> 03:54:51,620 And now, what's going to happen? 4104 03:54:51,620 --> 03:54:56,180 Let me go ahead and print out just\n 4105 03:54:56,181 --> 03:54:58,441 And let me save and let me run this. 4106 03:54:58,441 --> 03:54:59,990 Let me see how close we are. 4107 03:55:01,970 --> 03:55:06,421 OK, three, that's clearly wrong, but\n 4108 03:55:07,581 --> 03:55:12,890 What's the one fix I need now to\n 4109 03:55:12,890 --> 03:55:15,200 down to the next row when appropriate? 4110 03:55:17,140 --> 03:55:19,091 Yeah, I need one of these backslash n's. 4111 03:55:19,091 --> 03:55:23,921 And let me add some comments now to\n 4112 03:55:23,921 --> 03:55:30,970 For each row, for each column,\nhow about print a brick-- 4113 03:55:30,970 --> 03:55:33,100 just to kind of explain the logic? 4114 03:55:33,101 --> 03:55:37,810 And so I add that because\nnow move to next row 4115 03:55:37,810 --> 03:55:40,390 I could do something like\nthis with a backslash n. 
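The nested loops and comments just described might look roughly like this. It is a sketch rather than the lecture's exact file: the function name print_square and the returned brick count are additions made here so the logic can be checked, and the size n is passed in instead of being read with get_int.

```c
#include <stdio.h>

// Prints an n-by-n square of bricks and returns how many bricks were printed.
int print_square(int n)
{
    int bricks = 0;

    // For each row
    for (int i = 0; i < n; i++)
    {
        // For each column
        for (int j = 0; j < n; j++)
        {
            // Print a brick
            printf("#");
            bricks++;
        }

        // Move to next row
        printf("\n");
    }
    return bricks;
}
```

Calling print_square(3) prints three rows of three hash symbols. The newline sits after the inner loop, not inside it, which is what moves the cursor down only once per row.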
4116 03:55:40,390 --> 03:55:43,751 So here is where the comments,\nreally, my pseudocode 4117 03:55:43,751 --> 03:55:46,730 actually kind of illuminates\nthe situation a bit. 4118 03:55:46,730 --> 03:55:51,353 Let me go ahead and recompile\n 4119 03:55:51,353 --> 03:55:53,770 It's not a perfect square,\njust because these hash symbols 4120 03:55:53,771 --> 03:55:57,551 are a little taller than they are wide,\n 4121 03:55:57,550 --> 03:56:03,751 Now I've done something that's quite\n 4122 03:56:03,751 --> 03:56:08,244 All right, so let me pause here\n 4123 03:56:08,244 --> 03:56:10,411 Again, the code's getting\na little more complicated 4124 03:56:10,411 --> 03:56:14,115 but we're just building more\n 4125 03:56:14,115 --> 03:56:15,990 with familiar puzzle\npieces-- some variables 4126 03:56:15,990 --> 03:56:17,341 some loops, some conditionals. 4127 03:56:17,341 --> 03:56:19,411 It's all the same as before. 4128 03:56:20,880 --> 03:56:22,140 Can you multiply strings in C? 4129 03:56:22,890 --> 03:56:25,807 But ask that same question again in\n 4130 03:56:25,808 --> 03:56:27,774 and the answer will be yes. 4131 03:56:29,880 --> 03:56:32,520 In C, you must specify\nthe return type, the name 4132 03:56:32,521 --> 03:56:34,581 of the function, and the\ninputs, or arguments 4133 03:56:34,581 --> 03:56:35,831 to the function in that order. 4134 03:56:35,831 --> 03:56:39,181 And if none of them are applicable,\nyou write the word void. 4135 03:56:39,181 --> 03:56:42,060 So same question as earlier, let\nme kick that can a week or so 4136 03:56:42,060 --> 03:56:44,261 and we'll come back to\nthat and we'll see why. 4137 03:56:44,261 --> 03:56:47,281 But for now, just take on faith\n 4138 03:56:47,281 --> 03:56:49,651 Because main is a little\nspecial, similar to the 4139 03:56:51,031 --> 03:56:54,271 It too was a little special as well. 4140 03:57:02,411 --> 03:57:05,960 If you want to get out of a\nloop early, you could do this. 
4141 03:57:05,960 --> 03:57:08,320 So let me answer this question this way. 4142 03:57:08,320 --> 03:57:14,390 An alternative to a do while loop\n 4143 03:57:15,880 --> 03:57:18,290 so do the following forever-- 4144 03:57:18,290 --> 03:57:23,470 let me go ahead and get an int from\n 4145 03:57:26,771 --> 03:57:28,611 that is, a positive integer-- 4146 03:57:28,611 --> 03:57:32,411 then go ahead and use a\nnew keyword called break. 4147 03:57:32,411 --> 03:57:35,691 This is identical to what we just did. 4148 03:57:37,130 --> 03:57:39,671 It's like a couple extra\nlines, a lot of them are blank. 4149 03:57:39,671 --> 03:57:41,111 And so it's just an alternative. 4150 03:57:41,111 --> 03:57:43,856 But a do while does the same\nthing but a little tighter-- 4151 03:57:43,855 --> 03:57:47,001 if that's an answer to your question. 4152 03:57:47,001 --> 03:57:51,434 All right, so let's now introduce,\n 4153 03:57:51,433 --> 03:57:53,350 that I've kind of been\nbrushing under the rug 4154 03:57:53,351 --> 03:57:55,781 though we did see a little bit\nof evidence of this earlier 4155 03:57:55,781 --> 03:57:57,881 when we tried to add 2\nbillion and 2 billion 4156 03:57:57,880 --> 03:58:02,180 and it overflowed the number\nof bits in an int, so to speak. 4157 03:58:02,181 --> 03:58:06,431 Let me go ahead and code up a\nprogram called calculator again. 4158 03:58:06,431 --> 03:58:09,221 But I'm going to go ahead now\nand change this to floats. 4159 03:58:09,220 --> 03:58:12,820 So I'm going to change x to a float,\n 4160 03:58:12,820 --> 03:58:15,650 And a float, again, is just\na floating point value 4161 03:58:15,650 --> 03:58:19,070 which is a fancy way of saying a real\n 4162 03:58:19,070 --> 03:58:22,480 And down here, I'm going to\ngo ahead and use %f for float. 4163 03:58:22,480 --> 03:58:24,851 And I'm going to go ahead\nnow and do one more thing.
4164 03:58:24,851 --> 03:58:27,643 Instead of addition, I want to do\n 4165 03:58:29,501 --> 03:58:32,050 And I'm going to give myself\nanother third float called z 4166 03:58:32,050 --> 03:58:33,730 as we did at the beginning of today. 4167 03:58:33,730 --> 03:58:37,331 And I'm going to print out z\ninstead of x and y explicitly. 4168 03:58:37,331 --> 03:58:42,011 So I'm going to go ahead now and\n 4169 03:58:42,011 --> 03:58:44,320 And let's do something like, oh, 2/3. 4170 03:58:47,710 --> 03:58:49,600 So that's what you would rather expect. 4171 03:58:52,310 --> 03:58:54,501 All right, so 0.1, and a bunch of zeros. 4172 03:58:54,501 --> 03:58:56,521 That too is what you\nwould rather expect. 4173 03:58:56,521 --> 03:58:58,191 But now let me get a little curious. 4174 03:58:58,191 --> 03:59:02,271 It turns out that in C, you can\n 4175 03:59:03,230 --> 03:59:05,390 By default, you get 6 or so digits. 4176 03:59:05,390 --> 03:59:07,850 Suppose that you want\nto get exactly 2 digits. 4177 03:59:07,851 --> 03:59:11,816 You can more succinctly say 0.2\n 4178 03:59:11,816 --> 03:59:14,691 This is the kind of thing that's\n 4179 03:59:14,691 --> 03:59:16,911 and you find that, OK,\nformat code for floats 4180 03:59:16,911 --> 03:59:19,531 uses 0.2 to do two decimal points. 4181 03:59:19,531 --> 03:59:23,061 So let me do make calculator\nagain, ./calculator. 4182 03:59:25,501 --> 03:59:28,490 So it handles the display of\nsignificant digits for us here. 4183 03:59:28,490 --> 03:59:32,181 And now let me go ahead\nand do 1/10 and 0.10. 4184 03:59:33,494 --> 03:59:35,661 Well, maybe I really want\na lot of precision, right? 4185 03:59:35,661 --> 03:59:37,161 I've got a really powerful computer. 4186 03:59:37,161 --> 03:59:39,681 Let me see 50 numbers\nafter the decimal point. 4187 03:59:39,681 --> 03:59:41,551 That's a lot of significant digits. 4188 03:59:41,550 --> 03:59:44,341 Let me remake the\ncalculator-- whoops, typo. 
4189 03:59:44,341 --> 03:59:48,980 Let me remake the calculator,\n./calculator. 4190 03:59:54,400 --> 03:59:58,390 Pretty sure it's supposed to be\n 4191 03:59:59,531 --> 04:00:01,239 All right, well, maybe\nthat's just a bug. 4192 04:00:02,290 --> 04:00:05,050 OK, that's really getting funky. 4193 04:00:06,400 --> 04:00:10,448 It seems that my program not only\ncannot do addition very well-- 4194 04:00:10,448 --> 04:00:12,281 we eventually hit\nproblems in the billions-- 4195 04:00:12,281 --> 04:00:16,861 we can't even do very\nprecise numbers here. 4196 04:00:19,861 --> 04:00:22,201 In a nutshell, the computer's\napproximating the answer 4197 04:00:22,200 --> 04:00:25,142 using that many numbers\nafter the decimal point. 4198 04:00:25,143 --> 04:00:26,851 But the problem\nfundamentally is actually 4199 04:00:26,851 --> 04:00:30,101 very similar to that integer\noverflow from before. 4200 04:00:30,101 --> 04:00:31,801 And I'm using that now as a term of art. 4201 04:00:31,800 --> 04:00:36,210 Integers can overflow if you're trying\n 4202 04:00:37,128 --> 04:00:40,378 You sort of change them all to ones, and\n 4203 04:00:40,378 --> 04:00:42,761 Same thing here, but in the\ndifferent context of floats-- 4204 04:00:42,761 --> 04:00:45,060 if you only have 32\nbits-- or, heck, if we 4205 04:00:45,060 --> 04:00:48,720 change to double and only have 64\n 4206 04:00:50,070 --> 04:00:54,070 And, yet, pretty sure there's an\n 4207 04:00:54,070 --> 04:00:59,040 In the world, which is to say a computer\n 4208 04:00:59,040 --> 04:01:01,413 represent all possible\nnumbers in the world. 4209 04:01:01,414 --> 04:01:03,331 Because, again, there's\nnot an infinite number 4210 04:01:03,331 --> 04:01:06,570 of permutations of 32 or 64 bits. 4211 04:01:06,570 --> 04:01:10,351 It might be a lot, in the billions\n 4212 04:01:10,351 --> 04:01:13,771 And so, indeed, this is the\ncomputer's closest approximation 4213 04:01:13,771 --> 04:01:15,971 to what's actually going on there.
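To make the approximation concrete, here is a small sketch of the calculator's division with two different format codes. The function names divide and show_precision are hypothetical, and the operands are passed in rather than read with the CS50 library's input functions.

```c
#include <stdio.h>

// Divides two floats, as the lecture's calculator does.
float divide(float x, float y)
{
    return x / y;
}

// Prints the same quotient twice: with 2 digits after the decimal point,
// and then with 50, which exposes the stored approximation.
void show_precision(float x, float y)
{
    float z = divide(x, y);
    printf("%.2f\n", z);
    printf("%.50f\n", z);
}
```

Calling show_precision(1, 10) prints 0.10 on the first line, while the second line shows digits that drift away from zero after the first several places, because a 32-bit float can only hold the nearest representable value to one tenth.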
4214 04:01:15,970 --> 04:01:18,990 And so this is an example of what\n 4215 04:01:21,001 --> 04:01:26,341 Floating-point imprecision refers to the\n 4216 04:01:26,341 --> 04:01:29,341 to represent all possible\nreal numbers 100% 4217 04:01:29,341 --> 04:01:33,118 precisely, at least by default\nin languages like C. Thankfully 4218 04:01:33,118 --> 04:01:35,201 in the world of scientific\ncomputing and so forth 4219 04:01:35,200 --> 04:01:38,820 there are solutions to this problem\n 4220 04:01:38,820 --> 04:01:42,070 But the problem fundamentally\nis still going to be there. 4221 04:01:42,070 --> 04:01:44,790 So there's a reason I\nchanged x and y to floats. 4222 04:01:44,790 --> 04:01:47,050 Let's see what would\nhappen if we rewound a bit. 4223 04:01:47,050 --> 04:01:52,501 And instead of using floats for x and y,\n 4224 04:01:52,501 --> 04:01:56,161 And let's go far back\nand use get_int as well 4225 04:01:56,161 --> 04:01:59,138 thereby giving us integers x and y. 4226 04:01:59,138 --> 04:02:01,721 Let's still leave z as a float,\nbecause at the end of the day 4227 04:02:01,720 --> 04:02:04,440 we want to be able to handle\nfractions or floating-point values. 4228 04:02:04,441 --> 04:02:07,381 But let's go ahead now and\nprint out this value of z 4229 04:02:07,380 --> 04:02:09,720 having changed x and y now to ints. 4230 04:02:09,720 --> 04:02:15,390 make calculator, ./calculator, and\n 4231 04:02:16,710 --> 04:02:21,960 And it's not 0.666, and it's\nnot even rounding oddly. 4232 04:02:21,960 --> 04:02:23,760 It's just all zeros this time. 4233 04:02:24,970 --> 04:02:28,770 Well, it turns out that C, when\n 4234 04:02:28,771 --> 04:02:31,771 is always going to give you\nback an integer, an int. 4235 04:02:31,771 --> 04:02:34,951 The problem is that floating-point\nvalues don't fit in ints. 4236 04:02:34,950 --> 04:02:37,650 Only the integral part to the\nleft of the decimal point does. 
4237 04:02:37,650 --> 04:02:41,490 Everything at and beyond the decimal\n 4238 04:02:41,490 --> 04:02:44,281 known as a feature in\nC called truncation. 4239 04:02:44,281 --> 04:02:47,591 When dividing an integer by an\ninteger, you get back an integer. 4240 04:02:47,591 --> 04:02:51,181 But if you're trying to then store\n 4241 04:02:51,181 --> 04:02:54,631 result in that integer, C is just\ngoing to throw away everything 4242 04:02:54,630 --> 04:02:57,421 at and beyond the decimal point,\nleaving us with this case 4243 04:02:57,421 --> 04:03:03,521 in just the 0 from what should\nhave been 0.666666 and so forth. 4244 04:03:03,521 --> 04:03:05,230 So let's see one more example, in fact. 4245 04:03:05,229 --> 04:03:06,841 Let me go back to my terminal here. 4246 04:03:06,841 --> 04:03:08,550 Let me do ./calculator again. 4247 04:03:09,751 --> 04:03:13,751 This time, It should be\n1.33333 and so forth. 4248 04:03:13,751 --> 04:03:20,729 But let's see, 4 divided by 3, both as\n 4249 04:03:20,729 --> 04:03:23,761 but there too the\nanswer should be 1.333. 4250 04:03:23,761 --> 04:03:27,871 But the floating-point part is\ngetting truncated or thrown away 4251 04:03:30,370 --> 04:03:33,970 Well, certainly, we could just use\n 4252 04:03:33,970 --> 04:03:37,591 But if, by nature of your program,\n 4253 04:03:37,591 --> 04:03:40,800 or maybe even longs, for which\nthe same problem would occur-- 4254 04:03:40,800 --> 04:03:44,069 what we can actually do\nis called type conversion. 4255 04:03:44,069 --> 04:03:47,040 And we can explicitly tell\nthe computer that we actually 4256 04:03:47,040 --> 04:03:50,229 want to treat this int as though\nit's a floating-point value. 4257 04:03:50,229 --> 04:03:52,020 And we can do that for both x and y. 
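A minimal sketch of truncation next to the type conversion just mentioned, assuming two hypothetical helper functions (divide_truncated and divide_as_float) in place of the lecture's calculator program:

```c
// The division is done on two ints first, so the fractional part is
// already gone by the time the result is stored in the float z.
float divide_truncated(int x, int y)
{
    float z = x / y;
    return z;
}

// Converting one operand to float before dividing makes C perform
// floating-point division, so the fractional part survives.
float divide_as_float(int x, int y)
{
    float z = (float) x / y;
    return z;
}
```

Here divide_truncated(2, 3) gives 0 and divide_truncated(4, 3) gives 1, while divide_as_float returns values close to 0.666 and 1.333, still subject to floating-point imprecision.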
4258 04:03:52,021 --> 04:03:55,871 So let me go back to my code here, and\n 4259 04:03:55,870 --> 04:04:01,260 I can convert y to a float by\n 4260 04:04:01,261 --> 04:04:04,261 by literally writing the type\nfloat inside of parentheses 4261 04:04:05,431 --> 04:04:08,881 And if I really want to be explicit,\n 4262 04:04:08,880 --> 04:04:12,790 But, strictly speaking, it suffices\n 4263 04:04:14,069 --> 04:04:19,110 Let me go ahead now and do make\ncalculator again, ./calculator 4264 04:04:19,111 --> 04:04:21,900 and let's try 2 divided by 3. 4265 04:04:21,899 --> 04:04:25,139 And now, we're back to an\nanswer that's closer to correct. 4266 04:04:25,139 --> 04:04:27,840 But, indeed, we're still having\nsome rounding issues there. 4267 04:04:27,841 --> 04:04:31,621 Let's run it one more\ntime for 4 divided by 3. 4268 04:04:31,620 --> 04:04:33,989 There too we're closer to\nthe right answer, at least. 4269 04:04:33,989 --> 04:04:36,450 But we still have that\nfloating-point imprecision 4270 04:04:36,450 --> 04:04:39,300 but that's going to be another\nproblem altogether to solve. 4271 04:04:39,300 --> 04:04:41,220 And here in a little\nmore detail is that issue 4272 04:04:41,220 --> 04:04:44,399 of integer overflow, which\nis in the context of ints. 4273 04:04:44,399 --> 04:04:48,389 Suppose that we think back to\nlast week when we had three bits 4274 04:04:48,389 --> 04:04:53,970 and we counted from 0 to\n7, 0, 1, 2, 3, 4, 5, 6, 7. 4275 04:04:53,970 --> 04:04:56,220 I think I asked the question,\nhow would we count to 8? 4276 04:04:56,220 --> 04:04:58,290 Someone proposed, well,\nwe need a fourth bit. 4277 04:04:58,290 --> 04:05:01,559 That's fine if you have a\nfourth bit, if you have access 4278 04:05:01,559 --> 04:05:03,479 to another light bulb or transistor. 
4279 04:05:03,479 --> 04:05:09,239 If you don't, though, the next number\n 4280 04:05:09,239 --> 04:05:13,110 But if you don't have space for\nor hardware for that fourth bit 4281 04:05:13,111 --> 04:05:16,341 you might as well just be\nrepresenting the number 0. 4282 04:05:16,341 --> 04:05:19,640 So in the world of integers, if\nyou're only using three bits 4283 04:05:19,639 --> 04:05:23,360 those three bits eventually\noverflow when you count past 7. 4284 04:05:23,361 --> 04:05:28,011 Because what should be 8 can't fit, so\n 4285 04:05:28,011 --> 04:05:30,841 And as arcane as this\nproblem might seem 4286 04:05:30,841 --> 04:05:33,050 we humans have done\nthis a couple of times. 4287 04:05:33,050 --> 04:05:35,327 You might recall\nknowing about or reading 4288 04:05:35,327 --> 04:05:37,161 about the Y2K problem,\nwhere a lot of people 4289 04:05:37,161 --> 04:05:38,460 thought the world was going to end. 4290 04:05:38,960 --> 04:05:43,911 Because on January 1st of\n2000, a lot of computers 4291 04:05:43,911 --> 04:05:48,501 presumably, were going to update their\n 4292 04:05:48,501 --> 04:05:51,831 The problem is, though, for\ndecades, for efficiency, we humans 4293 04:05:51,831 --> 04:05:54,710 were honestly in the habit of\nnot storing years as four digits. 4294 04:05:55,233 --> 04:05:58,191 Because that's just a lot of space\n 4295 04:05:59,460 --> 04:06:02,540 So a lot of computer systems,\nespecially early on when 4296 04:06:02,540 --> 04:06:05,780 hardware was very expensive\nand memory was very tight 4297 04:06:05,781 --> 04:06:08,390 just stored the last\ntwo digits of any year. 4298 04:06:08,389 --> 04:06:14,270 The problem, of course, on January 1st\n 4299 04:06:14,271 --> 04:06:19,341 But if you don't have room for\nanother digit it's just 00. 4300 04:06:19,341 --> 04:06:23,720 And if your code assumes a prefix of\n 4301 04:06:23,720 --> 04:06:26,862 1999 back to the year 1900. 
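Both rollovers described above follow the same pattern, which can be sketched in a few lines. The function names are made up for this illustration, and the prefix of 19 is an assumption inferred from the lecture's example of 1999 rolling back to 1900.

```c
// A counter with only three bits can hold 0 through 7.
// Counting past 7 loses the carry, so the value wraps back to 0.
unsigned int next_three_bit(unsigned int value)
{
    return (value + 1) & 0x7; // keep only the low three bits
}

// A year stored as its last two digits, with an assumed prefix of 19.
// There is no room for a third digit, so 99 is followed by 00.
int next_full_year(int two_digit_year)
{
    int next = (two_digit_year + 1) % 100;
    return 1900 + next;
}
```

So next_three_bit(7) is 0 rather than 8, and next_full_year(99) is 1900 rather than 2000.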
4302 04:06:26,862 --> 04:06:29,570 Thankfully, long story short, a\n 4303 04:06:29,570 --> 04:06:32,880 in a lot of old languages and\nmostly warded off this problem 4304 04:06:34,400 --> 04:06:39,980 The next time the world might end\n 4305 04:06:39,980 --> 04:06:42,560 Now, that might feel\nlike a long time away 4306 04:06:42,560 --> 04:06:44,931 but so did the year 2000, at one point. 4307 04:06:44,931 --> 04:06:51,351 Why might clocks again break in\n 4308 04:06:54,980 --> 04:06:57,150 So this refers to some\nnumber of seconds. 4309 04:06:57,150 --> 04:07:00,560 So it turns out that the way\n 4310 04:07:00,560 --> 04:07:04,130 is they count the total number\nof seconds since the epoch, which 4311 04:07:04,130 --> 04:07:06,620 is defined as January 1, 1970. 4312 04:07:07,220 --> 04:07:09,890 It was just a good year\nto start counting at 4313 04:07:09,890 --> 04:07:11,880 when computers really\ncame onto the scene. 4314 04:07:11,880 --> 04:07:16,400 Unfortunately, most computers used 32\n 4315 04:07:16,400 --> 04:07:20,331 since January 1, 1970, the\nimplication of which is we 4316 04:07:20,331 --> 04:07:23,210 can only count up to\nroughly 2 billion seconds. 4317 04:07:23,210 --> 04:07:29,810 2 billion seconds is going to\nhappen in 2038, at which point 31 1's 4318 04:07:29,810 --> 04:07:31,700 are going to roll over as follows. 4319 04:07:31,700 --> 04:07:34,898 That number 2 billion,\nwhich is the max-- 4320 04:07:34,898 --> 04:07:37,440 because if you're representing\npositive and negative numbers 4321 04:07:37,441 --> 04:07:39,941 recall that you can only count\nas high as positive 2 billion 4322 04:07:42,380 --> 04:07:44,421 This is roughly the number\n2 billion in binary. 4323 04:07:44,421 --> 04:07:47,041 It's all ones with one\nzero way over here.
4324 04:07:47,040 --> 04:07:50,990 If I count one second past that\n2 billion number, give or take-- 4325 04:07:50,990 --> 04:07:54,050 that means, all right,\nI add 1, I carry the 1-- 4326 04:07:54,050 --> 04:07:57,290 it's just like 9's\nbecoming 0's in decimal. 4327 04:07:57,290 --> 04:08:01,161 If I keep this sort of simple\n 4328 04:08:01,161 --> 04:08:05,451 carrying the 1, carrying the 1, 1 second\n 4329 04:08:05,450 --> 04:08:08,220 I have this number in\nthe computer's memory. 4330 04:08:08,220 --> 04:08:11,780 So there's still 1 bit that's\na 1 all the way to the left. 4331 04:08:11,781 --> 04:08:16,041 Unfortunately, that bit\noften represents negativity 4332 04:08:16,040 --> 04:08:20,720 whereby if that first bit is negative,\n 4333 04:08:20,720 --> 04:08:22,280 somehow represents a negative number. 4334 04:08:23,331 --> 04:08:24,831 There's a fancier representation. 4335 04:08:24,831 --> 04:08:27,800 But a very big, positive\nnumber very suddenly 4336 04:08:27,800 --> 04:08:29,690 becomes a very big, negative number. 4337 04:08:29,691 --> 04:08:32,671 And that number is roughly\nnegative 2 billion. 4338 04:08:32,671 --> 04:08:35,541 That means computers\nin 2038 on that date 4339 04:08:35,540 --> 04:08:37,970 are going to accidentally\nthink that it's 4340 04:08:37,970 --> 04:08:43,610 been negative 2 billion seconds since\n 4341 04:08:43,611 --> 04:08:46,731 computers potentially think it's 1901. 4342 04:08:46,730 --> 04:08:50,990 So what is the solution to\nthe 2038 problem, perhaps? 4343 04:08:50,990 --> 04:08:53,601 Y2K was because we were\nusing two digits for years. 4344 04:08:55,970 --> 04:09:00,591 And, thankfully, we're getting a\n 4345 04:09:00,591 --> 04:09:03,320 and computers now are\nincreasingly using 64 bits. 
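The rollover can be sketched with fixed-width integer types. The names tick_32 and tick_64 are hypothetical. Because overflowing a signed int is undefined behavior in C, this sketch does the addition on unsigned bits and converts back, which on typical two's complement machines reproduces the wraparound described above.

```c
#include <stdint.h>

// One second later on a clock that stores seconds since the epoch
// in a signed 32-bit integer.
int32_t tick_32(int32_t seconds)
{
    // Add on the raw bits, then reinterpret as signed. At the maximum
    // value the leftmost bit becomes 1 and the result turns negative.
    return (int32_t) ((uint32_t) seconds + 1u);
}

// The same step with 64 bits, which leaves far more room to count.
int64_t tick_64(int64_t seconds)
{
    return seconds + 1;
}
```

Passing INT32_MAX (about 2 billion) to tick_32 yields INT32_MIN (about negative 2 billion), whereas tick_64 simply moves on to 2147483648.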
4346 04:09:03,320 --> 04:09:05,716 And all of us will be long\ngone by the time we run out 4347 04:09:05,716 --> 04:09:08,091 of that number of seconds, so\nit's someone else's problem 4348 04:09:09,841 --> 04:09:11,774 But that's really the\nfundamental solution. 4349 04:09:11,773 --> 04:09:13,940 If you're running up against\nsomething finite, well 4350 04:09:13,941 --> 04:09:16,371 just kick the can further and\njust give yourself more bits. 4351 04:09:16,370 --> 04:09:18,710 And, frankly, because hardware\nis so much cheaper these days 4352 04:09:18,710 --> 04:09:21,200 computers are so much faster,\nit's not as big of a deal 4353 04:09:21,200 --> 04:09:22,730 as it might have been decades ago. 4354 04:09:22,730 --> 04:09:24,501 But that's indeed the solution. 4355 04:09:24,501 --> 04:09:27,291 But this arises in very common contexts. 4356 04:09:27,290 --> 04:09:32,120 In fact, let me go ahead and write a\n 4357 04:09:32,120 --> 04:09:35,150 You might think that just converting\n 4358 04:09:35,150 --> 04:09:37,730 might be simple, but let\nme go ahead and do this. 4359 04:09:37,730 --> 04:09:41,331 In pennies.c, I'm going to\ngo ahead and include cs50.h. 4360 04:09:41,331 --> 04:09:48,271 And I'm going to include stdio.h,\n 4361 04:09:48,271 --> 04:09:50,400 And now down here, I'm going to do this. 4362 04:09:50,400 --> 04:09:52,370 I'm going to get a float\ncalled amount, and I'm 4363 04:09:52,370 --> 04:09:56,540 going to ask the user for some amount\n 4364 04:09:56,540 --> 04:09:59,270 and I'm going to store that\nin a variable called amount. 4365 04:09:59,271 --> 04:10:07,461 Then I'm going to simply convert that\n 4366 04:10:10,111 --> 04:10:16,281 And then I'm going to go ahead and print\n 4367 04:10:16,281 --> 04:10:18,231 because that's just an\ninteger in pennies-- 4368 04:10:18,230 --> 04:10:22,490 backslash n, quote\nunquote, comma, pennies. 
4369 04:10:22,490 --> 04:10:26,181 All right, so if I didn't make any\n 4370 04:10:27,800 --> 04:10:31,700 And suppose I have, say, $0.99, so 0.99. 4371 04:10:41,310 --> 04:10:42,931 There's that imprecision issue. 4372 04:10:42,931 --> 04:10:45,240 And this isn't even\nthat big of an amount. 4373 04:10:45,240 --> 04:10:49,302 Now, not a big deal if the cashier gives\n 4374 04:10:49,302 --> 04:10:50,761 but you can imagine this adding up. 4375 04:10:50,761 --> 04:10:54,240 You can imagine this being worrisome\nfor financial implications 4376 04:10:54,240 --> 04:10:57,630 for financial transactions, for\n 4377 04:10:57,630 --> 04:10:59,860 My program can't even handle this. 4378 04:10:59,861 --> 04:11:01,981 Well, there are some solutions here. 4379 04:11:01,980 --> 04:11:04,171 And it looks like what's\nreally happening-- 4380 04:11:04,171 --> 04:11:08,730 if I print it out using the %f with a\n 4381 04:11:09,450 --> 04:11:14,280 presumably, the computer is struggling\n 4382 04:11:14,281 --> 04:11:20,761 It's probably storing 4 dollars\nand 19.9999-something cents. 4383 04:11:20,761 --> 04:11:23,551 So it's close, but it's not quite there. 4384 04:11:23,550 --> 04:11:28,081 So I could at least solve this\nby rounding up, for instance. 4385 04:11:28,081 --> 04:11:31,081 And it turns out there is\na round function out there. 4386 04:11:31,081 --> 04:11:33,841 And it turns out that it's in a\nlibrary called the math library. 4387 04:11:33,841 --> 04:11:36,841 And you would know this by looking\n 4388 04:11:37,950 --> 04:11:43,800 And if I now make pennies again and\n 4389 04:11:46,171 --> 04:11:49,531 So at least in this context, it\nseems like a solvable problem. 4390 04:11:49,531 --> 04:11:53,461 But it's certainly something I\n 4391 04:11:53,460 --> 04:11:57,180 Unfortunately, even professional,\n 4392 04:11:57,181 --> 04:12:00,073 have not been particularly\nattentive to these kinds of details. 
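The pennies conversion and its fix can be sketched like this. The function names are hypothetical, and instead of the round function from the math library that the lecture uses, this sketch adds one half before truncating, which behaves the same way for the non-negative dollar amounts considered here and avoids linking an extra library.

```c
// Converts dollars to pennies by truncation. A float that is stored as
// slightly less than the intended amount ends up one penny short.
int pennies_truncated(float amount)
{
    return (int) (amount * 100);
}

// Rounds to the nearest penny instead (assumes amount is not negative).
int pennies_rounded(float amount)
{
    return (int) (amount * 100 + 0.5f);
}
```

For an amount such as 4.20, which a float stores as 4 dollars and 19.9999-something cents, the truncating version gives 419 pennies and the rounding version gives 420.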
4393 04:12:00,073 --> 04:12:03,031 And in a class like this, the goal\n 4394 04:12:03,031 --> 04:12:06,551 but to really teach you what's going\n 4395 04:12:06,550 --> 04:12:09,600 so that you have a bottom-up\nunderstanding of how data 4396 04:12:09,601 --> 04:12:11,980 is represented, how computers\nare manipulating it 4397 04:12:11,980 --> 04:12:16,060 so that you are not on the failing\n 4398 04:12:16,060 --> 04:12:19,360 And so that we as a society are not\n 4399 04:12:19,861 --> 04:12:22,231 And this happens,\nunfortunately, all of the time. 4400 04:12:22,230 --> 04:12:26,011 This is a Boeing airplane\nthat a few years ago needed 4401 04:12:26,011 --> 04:12:29,621 to be rebooted after every 248 days. 4402 04:12:30,120 --> 04:12:34,740 Because this Boeing airplane software\n 4403 04:12:34,740 --> 04:12:36,990 tenths of a second to keep\ntrack of something or other 4404 04:12:36,990 --> 04:12:38,761 related to its electrical power. 4405 04:12:38,761 --> 04:12:43,470 And, unfortunately, after 248 days of\n 4406 04:12:43,470 --> 04:12:45,690 which in the airline\nindustry is apparently not 4407 04:12:45,691 --> 04:12:49,381 uncommon to make every dollar count,\n 4408 04:12:50,581 --> 04:12:54,511 the 32-bit number would\nroll over and the power 4409 04:12:54,511 --> 04:12:57,601 would shut off on the airplane\nas a side effect because of sort 4410 04:12:57,601 --> 04:12:59,681 of undefined behavior in that case. 4411 04:12:59,681 --> 04:13:02,941 The temporary solution by Boeing at\n 4412 04:13:02,941 --> 04:13:06,150 sort of operating system style,\n 4413 04:13:06,150 --> 04:13:09,841 And that was indeed the fix until they\n 4414 04:13:11,351 --> 04:13:14,400 And the more hardware we carry\n 4415 04:13:14,400 --> 04:13:17,550 use these kinds of devices,\nthe more of these problems 4416 04:13:17,550 --> 04:13:20,430 we're going to run into down the road. 4417 04:14:42,261 --> 04:14:45,086 DAVID MALAN: This is\nCS50 and this is week 2. 
4418 04:14:45,085 --> 04:14:47,710 Now that you have some programming\nexperience under your belts 4419 04:14:47,710 --> 04:14:50,170 in this more arcane language called C, 4420 04:14:50,171 --> 04:14:53,050 among our goals today is to help\n 4421 04:14:53,050 --> 04:14:54,911 been doing these past several days. 4422 04:14:54,911 --> 04:14:58,216 Wrestling with your first programs in\n 4423 04:14:58,216 --> 04:15:00,341 up understanding of what\nsome of these commands do. 4424 04:15:00,341 --> 04:15:02,841 And, ultimately, what more\nwe can do with this language. 4425 04:15:02,841 --> 04:15:06,011 So this, recall, was the very\nfirst program you wrote, 4426 04:15:06,011 --> 04:15:09,131 I wrote in this language\ncalled C, much more textual 4427 04:15:09,130 --> 04:15:11,230 certainly, than the Scratch equivalent. 4428 04:15:11,230 --> 04:15:15,460 But at the end of the day,\ncomputers, your Mac, your PC 4429 04:15:15,460 --> 04:15:18,815 VS Code doesn't understand\nthis actual code. 4430 04:15:18,816 --> 04:15:21,941 What's the format into which we need\n 4431 04:15:23,462 --> 04:15:26,050 DAVID MALAN: So binary,\notherwise known as machine code. 4432 04:15:26,550 --> 04:15:30,130 The 0s and 1s that your computer\nactually does understand. 4433 04:15:30,130 --> 04:15:32,290 So somehow we need to\nget to this format. 4434 04:15:32,290 --> 04:15:34,990 And up until now, we've been\nusing this command called make 4435 04:15:34,990 --> 04:15:37,931 which is aptly named, because\nit lets you make programs. 4436 04:15:37,931 --> 04:15:40,691 And the invocation of that\nhas been pretty simple. 4437 04:15:40,691 --> 04:15:44,711 make hello looks in your current\n 4438 04:15:44,710 --> 04:15:49,360 hello.c, implicitly, and then it\n 4439 04:15:49,361 --> 04:15:51,911 which itself is executable,\nwhich just means runnable 4440 04:15:51,911 --> 04:15:54,161 so that you can then do ./hello.
4441 04:15:54,161 --> 04:15:58,451 But it turns out that make is\nactually not a compiler itself. 4442 04:15:58,450 --> 04:16:00,100 It does help you make programs. 4443 04:16:00,101 --> 04:16:04,781 But make is this utility that comes on\n 4444 04:16:04,781 --> 04:16:08,321 to actually compile code by\nusing an actual compiler 4445 04:16:08,320 --> 04:16:12,550 the program that converts source code\n 4446 04:16:12,550 --> 04:16:14,921 or whatever cloud environment\nyou might be using. 4447 04:16:14,921 --> 04:16:17,591 In fact, what make is\ndoing for us, is actually 4448 04:16:17,591 --> 04:16:21,490 running a command automatically\nknown as clang, for C language. 4449 04:16:21,490 --> 04:16:25,851 And, so here, for instance, in VS\n 4450 04:16:25,851 --> 04:16:27,730 this time in the context\nof a text editor 4451 04:16:27,730 --> 04:16:30,940 and I could compile\nthis with make hello. 4452 04:16:30,941 --> 04:16:33,828 Let me go ahead and use the\ncompiler itself manually. 4453 04:16:33,827 --> 04:16:36,911 And we'll see in a moment why we've\n 4454 04:16:36,911 --> 04:16:39,320 I'm going to run clang instead. 4455 04:16:39,320 --> 04:16:41,601 And then I'm going to run hello.c. 4456 04:16:41,601 --> 04:16:43,751 So it's a little different\nhow the compiler's used. 4457 04:16:43,751 --> 04:16:46,421 It needs to know, explicitly,\nwhat the file is called. 4458 04:16:46,421 --> 04:16:49,541 I'll go ahead and run\nclang, hello.c, Enter. 4459 04:16:49,540 --> 04:16:52,675 Nothing seems to happen, which,\n 4460 04:16:52,675 --> 04:16:54,050 Because no errors have popped up. 4461 04:16:54,050 --> 04:17:00,400 And if I do ls for list, you'll see\n 4462 04:17:00,400 --> 04:17:03,490 But there is a curiously-named\nfile called a.out. 4463 04:17:03,490 --> 04:17:06,880 This is a historical convention,\nstands for assembler output. 
4464 04:17:06,880 --> 04:17:09,640 And this is, just, the default\nfile name for a program 4465 04:17:09,640 --> 04:17:13,661 that you might compile yourself,\nmanually, using clang itself. 4466 04:17:13,661 --> 04:17:16,091 Let me go ahead now and\npoint out that that's 4467 04:17:16,091 --> 04:17:17,601 kind of a stupid name for a program. 4468 04:17:17,601 --> 04:17:20,695 Even though it works,\n./a.out would work. 4469 04:17:20,695 --> 04:17:23,320 But if you actually want to\ncustomize the name of your program 4470 04:17:23,320 --> 04:17:26,980 we could just resort to make,\nor we could do explicitly 4471 04:17:28,181 --> 04:17:31,031 It turns out, some\nprograms, among them make 4472 04:17:31,031 --> 04:17:33,251 support what are called\ncommand line arguments 4473 04:17:33,251 --> 04:17:34,570 and more on those later today. 4474 04:17:34,570 --> 04:17:37,931 But these are literally words or\n 4475 04:17:37,931 --> 04:17:41,591 after the name of a program that just\n 4476 04:17:44,300 --> 04:17:47,200 And it turns out, if you read\nthe documentation for clang 4477 04:17:47,200 --> 04:17:52,300 you can actually pass a -o, for\n 4478 04:17:52,300 --> 04:17:54,520 lets you specify,\nexplicitly what do you want 4479 04:17:54,521 --> 04:17:56,056 your outputted program to be called? 4480 04:17:56,056 --> 04:17:58,931 And then you go ahead and type the\n 4481 04:17:58,931 --> 04:18:01,371 want to compile, from\nsource code to machine code. 4482 04:18:02,980 --> 04:18:06,251 Again, nothing seems to happen,\nand I type ls and voila. 4483 04:18:06,251 --> 04:18:09,271 Now we still have the old a.out,\nbecause I didn't delete it yet. 4484 04:18:10,271 --> 04:18:14,681 So ./hello, voila, runs\nhello, world again. 4485 04:18:14,681 --> 04:18:16,421 And let me go ahead\nand remove this file. 
4486 04:18:16,421 --> 04:18:20,854 I could, of course, resort to using\n 4487 04:18:20,853 --> 04:18:23,770 Which, I am in the habit of closing,\n 4488 04:18:23,771 --> 04:18:26,501 But I could go ahead and right-click\nor control-click on a.out 4489 04:18:26,501 --> 04:18:27,626 if I want to get rid of it. 4490 04:18:27,626 --> 04:18:30,560 Or again, let me focus on\nthe command line interface. 4491 04:18:32,290 --> 04:18:35,260 We didn't really use it much,\nbut what command removes a file? 4492 04:18:36,925 --> 04:18:40,690 DAVID MALAN: So rm for\nremove. rm a.out, Enter. 4493 04:18:40,691 --> 04:18:44,320 Remove regular file,\na.out, y for yes, enter. 4494 04:18:44,320 --> 04:18:46,900 And now, if I do ls\nagain, voila, it's gone. 4495 04:18:46,900 --> 04:18:48,911 All right, so, let's\nnow enhance this program 4496 04:18:48,911 --> 04:18:54,550 to do the second version we ever did,\n 4497 04:18:54,550 --> 04:18:57,409 so that we have access to functions\n 4498 04:18:57,409 --> 04:19:04,601 Let me do string name gets\nget string, what's your name 4499 04:19:05,810 --> 04:19:10,270 And now, let me go ahead and say hello\n 4500 04:19:11,181 --> 04:19:13,421 So this was version 2 of\nour program last time 4501 04:19:13,421 --> 04:19:17,560 that very easily compiled with make\n 4502 04:19:17,560 --> 04:19:20,620 If I want to compile this\nthing myself with clang, using 4503 04:19:20,620 --> 04:19:22,780 that same lesson learned,\nall right, let's do it. 4504 04:19:22,781 --> 04:19:29,561 clang -o hello, just so I get a better\n 4505 04:19:29,560 --> 04:19:34,011 And a new error pops up that some of\n 4506 04:19:34,011 --> 04:19:37,841 So it's a bit arcane here, and there's\n 4507 04:19:37,841 --> 04:19:39,591 with temp for temporary there. 4508 04:19:39,591 --> 04:19:42,820 But somehow, my issue's in\nmain, as we can see here. 4509 04:19:42,820 --> 04:19:44,518 It somehow relates to hello.c.
4510 04:19:44,518 --> 04:19:47,351 Even though we might not have seen\n 4511 04:19:47,351 --> 04:19:50,230 but there's an undefined\nreference to get string. 4512 04:19:50,230 --> 04:19:52,060 As though get string doesn't exist. 4513 04:19:52,060 --> 04:19:55,601 Now, your first instinct might be, well\n 4514 04:19:56,441 --> 04:19:58,570 That's the very first\nline of my program. 4515 04:19:58,570 --> 04:20:02,171 But it turns out, make is doing\n 4516 04:20:02,171 --> 04:20:06,191 Just putting cs50.h, or any header\nfile at the top of your code 4517 04:20:06,191 --> 04:20:10,990 for that matter, just teaches the\n 4518 04:20:10,990 --> 04:20:13,570 It, sort of, asks the compiler\nto-- it asks the compiler 4519 04:20:13,570 --> 04:20:16,870 to trust that I will, eventually,\n 4520 04:20:16,870 --> 04:20:22,390 like get string, and cs50.h,\nand stdio.h, printf, therein. 4521 04:20:22,390 --> 04:20:28,091 But this error here, some kind of\n 4522 04:20:28,091 --> 04:20:30,220 that there's a separate\nprocess for actually 4523 04:20:30,220 --> 04:20:34,540 finding the 0s and 1s that\ncs50 compiled long ago for you. 4524 04:20:34,540 --> 04:20:38,110 That authors of this operating\n 4525 04:20:39,161 --> 04:20:42,101 We need to, somehow,\ntell the compiler that we 4526 04:20:42,101 --> 04:20:44,711 need to link in code\nthat someone else wrote 4527 04:20:44,710 --> 04:20:48,010 the actual machine code that someone\n 4528 04:20:48,011 --> 04:20:51,757 So to do that, you'd have to\ntype -lcs50, for instance 4529 04:20:52,841 --> 04:20:55,809 So additionally, telling clang\n 4530 04:20:55,808 --> 04:20:58,600 a file called hello, and you want\n 4531 04:20:58,601 --> 04:21:03,461 you also want to quote-unquote\nlink in a bunch of 0s and 1s 4532 04:21:03,460 --> 04:21:07,270 that collectively implement\nget string and printf. 4533 04:21:07,271 --> 04:21:11,480 So now, if I hit enter,\nthis time it compiled OK. 
4534 04:21:11,480 --> 04:21:17,403 And now if I run ./hello, it works\n 4535 04:21:17,403 --> 04:21:20,361 But honestly, this is just going to\n 4536 04:21:20,361 --> 04:21:22,191 Notice, already, just\nto compile my code 4537 04:21:22,191 --> 04:21:25,677 I have to run clang -o\nhello hello.c -lcs50 4538 04:21:25,677 --> 04:21:27,761 and you're going to have\nto type more things, too. 4539 04:21:27,761 --> 04:21:31,150 If you wanted to use the math library,\n 4540 04:21:31,150 --> 04:21:33,700 you would also have\nto do -lm, typically 4541 04:21:33,700 --> 04:21:37,150 to specify give me the math\nbits that someone else compiled. 4542 04:21:37,150 --> 04:21:39,230 And the commands just\nget longer and longer. 4543 04:21:39,230 --> 04:21:43,780 So moving forward, we won't have\n 4544 04:21:43,781 --> 04:21:45,591 but clang is, indeed, the compiler. 4545 04:21:45,591 --> 04:21:48,640 That is the program that converts\n 4546 04:21:48,640 --> 04:21:52,699 But we'll continue to use make because\n 4547 04:21:52,699 --> 04:21:54,490 And the commands are\nonly going to get more 4548 04:21:54,490 --> 04:21:58,900 cryptic the more sophisticated and\n 4549 04:21:58,900 --> 04:22:03,880 And make, again, is just a tool\nthat makes all that happen. 4550 04:22:03,880 --> 04:22:08,560 Let me pause there to see if\n 4551 04:22:08,560 --> 04:22:10,150 take a look further under the hood. 4552 04:22:11,445 --> 04:22:14,445 AUDIENCE: Can you explain again what\n 4553 04:22:14,445 --> 04:22:16,779 DAVID MALAN: Sure, let me\ncome back to that in a moment. 4554 04:22:18,011 --> 04:22:20,177 We'll come back to that,\nvisually, in just a moment. 4555 04:22:20,177 --> 04:22:23,111 But it means to link in the\n0s and 1s that collectively 4556 04:22:23,111 --> 04:22:24,695 implement get string and printf. 4557 04:22:24,695 --> 04:22:26,320 But we'll see that, visually, in a sec. 4558 04:22:31,334 --> 04:22:32,751 DAVID MALAN: Really good question.
4559 04:22:32,751 --> 04:22:35,111 How come I didn't have\nto link in standard I/O? 4560 04:22:35,111 --> 04:22:37,211 Because I used printf in version 1. 4561 04:22:37,210 --> 04:22:40,540 Standard I/O is just, literally,\nso standard that it's built in 4562 04:22:43,060 --> 04:22:45,341 It did not come with the\nlanguage C or the compiler. 4563 04:22:46,511 --> 04:22:50,861 And other libraries, even though\n 4564 04:22:50,861 --> 04:22:54,861 they might not be enabled by default,\n 4565 04:22:54,861 --> 04:22:57,731 So you're not loading more 0s\nand 1s into the computer's memory 4566 04:22:58,540 --> 04:23:01,510 So standard I/O is special, if you will. 4567 04:23:05,681 --> 04:23:07,421 DAVID MALAN: Oh, what does the -o mean? 4568 04:23:07,421 --> 04:23:10,451 So -o is shorthand for\nthe English word output 4569 04:23:10,450 --> 04:23:15,520 and so -o is telling clang to\nplease output a file called hello 4570 04:23:15,521 --> 04:23:18,111 because the next thing I\nwrote after the command line 4571 04:23:18,111 --> 04:23:24,190 recall was clang -o hello, then\n 4572 04:23:24,190 --> 04:23:27,667 And this is where these commands\ndo get and stay fairly arcane. 4573 04:23:27,667 --> 04:23:29,501 It's just through muscle\nmemory and practice 4574 04:23:29,501 --> 04:23:31,871 that you'll start to remember, oh\n 4575 04:23:31,870 --> 04:23:34,537 what are the command line arguments\nyou can provide to programs? 4576 04:23:34,538 --> 04:23:35,831 But we've seen this before. 4577 04:23:35,831 --> 04:23:39,040 Technically, when you run make\n 4578 04:23:39,040 --> 04:23:41,240 hello is the command line argument. 4579 04:23:41,240 --> 04:23:43,300 It's an input to the\nmake function, albeit 4580 04:23:43,300 --> 04:23:46,510 typed at the prompt, that tells\nmake what you want to make. 
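As a rough illustration of what it means for a program to receive words like -o and hello after its name, here is a minimal sketch in C. The function name output_name is hypothetical, and this is not how clang itself is implemented; it only assumes the usual convention that a program receives its command line arguments as an array of strings.

```c
#include <string.h>

// Hypothetical helper: returns the word that follows "-o" among the
// command line arguments, or "a.out" if no -o was given.
const char *output_name(int argc, char *argv[])
{
    // argv[0] is the program's own name, so the search starts at 1.
    // The loop stops one short of the end so argv[i + 1] always exists.
    for (int i = 1; i < argc - 1; i++)
    {
        if (strcmp(argv[i], "-o") == 0)
        {
            return argv[i + 1];
        }
    }
    return "a.out";
}
```

Given the words clang, -o, hello, hello.c, this sketch would pick out hello; given only clang and hello.c, it falls back to a.out, mirroring the default name seen earlier.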
4581 04:23:46,511 --> 04:23:50,441 Even when I used rm a moment\nago, and did rm of a.out 4582 04:23:50,441 --> 04:23:52,541 the command line argument\nthere was called a.out 4583 04:23:52,540 --> 04:23:55,000 and it's telling rm what to delete. 4584 04:23:55,001 --> 04:23:59,531 It is entirely dependent on the programs\n 4585 04:23:59,531 --> 04:24:02,351 whether you use dash this\nor dash that, but we'll 4586 04:24:02,351 --> 04:24:05,066 see over time, which ones\nactually matter in practice. 4587 04:24:05,066 --> 04:24:10,481 So to come back to the first question\n 4588 04:24:10,480 --> 04:24:12,823 let's consider the code more closely. 4589 04:24:12,823 --> 04:24:14,531 So here is that first\nversion of the code 4590 04:24:14,531 --> 04:24:18,851 again, with stdio.h and only\nprintf, so no cs50 stuff yet. 4591 04:24:18,851 --> 04:24:21,101 Until we add it back in\nand had the second version 4592 04:24:21,101 --> 04:24:23,891 where we actually get the human's name. 4593 04:24:23,890 --> 04:24:27,043 When you run this command,\nthere's a few things 4594 04:24:27,043 --> 04:24:28,960 that are happening\nunderneath the hood, and we 4595 04:24:28,960 --> 04:24:30,911 won't dwell on these\nkinds of details, indeed 4596 04:24:30,911 --> 04:24:33,130 we'll abstract it away by using make. 4597 04:24:33,130 --> 04:24:35,200 But it's worth understanding\nfrom the get-go 4598 04:24:35,200 --> 04:24:38,140 how much automation is going on, so\n 4599 04:24:39,111 --> 04:24:42,201 You have this bottom-up\nunderstanding of what's going on. 4600 04:24:42,200 --> 04:24:45,790 So when we say you've been\ncompiling your code with make 4601 04:24:45,790 --> 04:24:47,860 that's a bit of an oversimplification. 4602 04:24:47,861 --> 04:24:51,041 Technically, every time\nyou compile your code 4603 04:24:51,040 --> 04:24:53,831 you're having the computer do\nfour distinct things for you. 
4604 04:24:53,831 --> 04:24:57,281 And this is not four distinct things\n 4605 04:24:57,281 --> 04:24:59,441 every time you run your\nprogram, what's happening 4606 04:24:59,441 --> 04:25:02,081 but it helps to break it\ndown into building blocks 4607 04:25:02,081 --> 04:25:06,371 as to how we're getting from source\n 4608 04:25:06,370 --> 04:25:10,900 It turns out, that when you compile,\n 4609 04:25:10,900 --> 04:25:14,771 speaking, you're doing four things\n 4610 04:25:14,771 --> 04:25:18,221 Preprocessing it, compiling it,\nassembling it, and linking it. 4611 04:25:18,220 --> 04:25:21,610 Just humans decided, let's just\n 4612 04:25:21,611 --> 04:25:24,490 But for a moment, let's\nconsider what these steps are. 4613 04:25:24,490 --> 04:25:26,950 So preprocessing refers to this. 4614 04:25:26,950 --> 04:25:30,970 If we look at our source code,\n 4615 04:25:30,970 --> 04:25:34,702 and therefore get string, notice that\n 4616 04:25:34,702 --> 04:25:36,911 And they're kind of special\nversus all the other code 4617 04:25:36,911 --> 04:25:39,970 we've written, because they start\n 4618 04:25:39,970 --> 04:25:41,921 And that's sort of a\nspecial syntax that means 4619 04:25:41,921 --> 04:25:44,861 that these are, technically,\ncalled preprocessor directives. 4620 04:25:44,861 --> 04:25:49,551 Fancy way of saying they're handled\n 4621 04:25:49,550 --> 04:25:54,130 In fact, if we focus on\ncs50.h, recall from last week 4622 04:25:54,130 --> 04:26:00,130 that I provided a hint as to what's\n 4623 04:26:00,130 --> 04:26:04,840 What was the one salient thing that\n 4624 04:26:04,841 --> 04:26:07,736 why we were including\nit in the first place? 4625 04:26:08,611 --> 04:26:11,111 DAVID MALAN: So get\nstring, specifically 4626 04:26:11,111 --> 04:26:13,421 the prototype for get string. 
4627 04:26:13,421 --> 04:26:15,671 We haven't made many of\nour own functions yet 4628 04:26:15,671 --> 04:26:18,101 but recall that any time\nwe've made our own functions 4629 04:26:18,101 --> 04:26:20,591 and we've written them\nbelow main in a file 4630 04:26:20,591 --> 04:26:23,050 we've also had to, somewhat\nstupidly, copy paste 4631 04:26:23,050 --> 04:26:25,630 the prototype of the function\nat the top of the file 4632 04:26:25,630 --> 04:26:29,470 just to teach the compiler that\n 4633 04:26:29,470 --> 04:26:31,690 it does down there, but it will exist. 4634 04:26:32,560 --> 04:26:35,240 So again, that's what these\nprototypes are doing for us. 4635 04:26:35,240 --> 04:26:37,601 So therefore, in my\ncode, If I want to use 4636 04:26:37,601 --> 04:26:41,021 a function like get string,\nor printf, for that matter 4637 04:26:41,021 --> 04:26:43,411 they're not implemented\nclearly in the same file 4638 04:26:43,411 --> 04:26:44,661 they're implemented elsewhere. 4639 04:26:44,661 --> 04:26:46,952 So I need to tell the compiler\nto trust me that they're 4640 04:26:46,952 --> 04:26:48,261 implemented somewhere else. 4641 04:26:48,261 --> 04:26:51,070 And so technically,\ninside of cs50.h, which 4642 04:26:51,070 --> 04:26:54,671 is installed somewhere in the\ncloud's hard drive, so to speak 4643 04:26:54,671 --> 04:26:59,081 that you all are accessing via VS Code,\n 4644 04:26:59,081 --> 04:27:03,130 A prototype for the get string function\n 4645 04:27:03,130 --> 04:27:07,090 get string, it takes one input,\nor argument, called prompt 4646 04:27:07,091 --> 04:27:09,970 and that type of that\nprompt is a string. 4647 04:27:09,970 --> 04:27:15,411 Get string, not surprisingly, has a\n 4648 04:27:15,411 --> 04:27:19,060 So literally, that line and a\nbunch of others, are in cs50.h. 4649 04:27:19,060 --> 04:27:22,540 So rather than you all having\nto copy paste the prototype 4650 04:27:22,540 --> 04:27:25,420 you can just trust that\ncs50 figured out what it is. 
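To make the role of a prototype concrete, here is a small sketch in C. The function names square and sum_of_squares are made up for illustration and do not come from the lecture; the point is only that the prototype at the top lets code use a function before its implementation appears.

```c
// Prototype: a promise to the compiler that a function with this name,
// input type, and return type will exist somewhere later.
int square(int n);

// This function uses square before square is implemented. Without the
// prototype above, the compiler would not yet know what square is.
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}

// The actual implementation, further down in the file.
int square(int n)
{
    return n * n;
}
```

A header file like cs50.h plays the same role as that first line, just for functions implemented in a different file altogether.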
4651 04:27:25,421 --> 04:27:29,230 You can include cs50.h\nand the compiler is going 4652 04:27:29,230 --> 04:27:31,681 to go find that prototype for you. 4653 04:27:31,681 --> 04:27:33,740 Same thing in standard\nI/O. Someone else-- what 4654 04:27:33,740 --> 04:27:37,880 must clearly be in stdio.h,\namong other stuff, that 4655 04:27:37,880 --> 04:27:41,850 motivates our including stdio.h, too? 4656 04:27:43,058 --> 04:27:45,290 DAVID MALAN: Printf, the\nprototype for printf 4657 04:27:45,290 --> 04:27:48,270 and I'll just change it here\nin yellow, to be the same. 4658 04:27:48,271 --> 04:27:49,671 And it turns out, the format-- 4659 04:27:49,671 --> 04:27:52,851 the prototype for printf\nis, actually, pretty fancy 4660 04:27:52,851 --> 04:27:56,001 because, as you might have noticed,\n 4661 04:27:56,001 --> 04:28:00,171 something to print, 2, if you want\n 4662 04:28:00,171 --> 04:28:02,880 So the dot dot dot just\nrepresents exactly that. 4663 04:28:02,880 --> 04:28:06,590 It's not quite as simple a prototype\n 4664 04:28:07,376 --> 04:28:10,310 So what does it mean to\npreprocess your code? 4665 04:28:10,310 --> 04:28:14,120 The very first thing the\ncompiler, clang, in this case 4666 04:28:14,120 --> 04:28:18,530 is doing for you when it reads your\n 4667 04:28:18,531 --> 04:28:22,221 notices, oh, here is hash include,\n 4668 04:28:22,220 --> 04:28:27,350 And it, essentially, finds those files\n 4669 04:28:27,351 --> 04:28:31,251 and does the equivalent of copying\n 4670 04:28:31,251 --> 04:28:33,621 into your code at the very top. 4671 04:28:33,620 --> 04:28:36,710 Thereby teaching the compiler\nthat get string and printf 4672 04:28:36,710 --> 04:28:38,690 will eventually exist somewhere.
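One way to see what including stdio.h supplies is to write the printf prototype by hand, dot dot dot and all. This is only a sketch for illustration: in practice you would include the header rather than declare printf yourself, and the helper name say_hello is hypothetical.

```c
// Written by hand here purely for illustration; normally
// #include <stdio.h> pastes this declaration in for you.
// The ... means printf accepts a variable number of arguments.
int printf(const char *format, ...);

// Hypothetical helper: prints a greeting and returns what printf
// returns, which is the number of characters it printed.
int say_hello(const char *name)
{
    return printf("hello, %s\n", name);
}
```

The file contains no implementation of printf at all. The prototype only asks the compiler to trust that the machine code for printf will be found somewhere else later.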
4673 04:28:38,691 --> 04:28:42,740 So that's the preprocessing\nstep, whereby, again, it's 4674 04:28:42,740 --> 04:28:46,341 just doing a find-and-replace of\n 4675 04:28:46,341 --> 04:28:48,771 It's plugging in the files\nthere so that you, essentially 4676 04:28:48,771 --> 04:28:52,041 get all the prototypes\nyou need automatically. 4677 04:28:53,091 --> 04:28:55,490 What does it mean, then,\nto compile the results? 4678 04:28:55,490 --> 04:28:57,710 Because at this point\nin the story, your code 4679 04:28:57,710 --> 04:28:59,938 now looks like this in\nthe computer's memory. 4680 04:28:59,939 --> 04:29:01,730 It doesn't change your\nfile, it's doing all 4681 04:29:01,730 --> 04:29:04,251 of this in the computer's\nmemory, or RAM, for you. 4682 04:29:04,251 --> 04:29:06,331 But it, essentially, looks like this. 4683 04:29:06,331 --> 04:29:09,861 Well the next step is what's,\ntechnically, really compiling. 4684 04:29:09,861 --> 04:29:12,681 Even though again, we use\ncompile as an umbrella term. 4685 04:29:12,681 --> 04:29:15,771 Compiling code in C\nmeans to take code that 4686 04:29:15,771 --> 04:29:18,001 now looks like this in\nthe computer's memory 4687 04:29:18,001 --> 04:29:21,150 and turn it into something\nthat looks like this. 4688 04:29:22,611 --> 04:29:25,251 But it was just a few\ndecades ago that, if you 4689 04:29:25,251 --> 04:29:28,191 were taking a class like\nCS50 in its earlier form 4690 04:29:28,191 --> 04:29:32,001 we wouldn't be using C it didn't exist\n 4691 04:29:32,001 --> 04:29:33,951 something called assembly language. 4692 04:29:33,950 --> 04:29:37,490 And there's different types of,\n 4693 04:29:37,490 --> 04:29:41,271 But this is about as low level as\n 4694 04:29:41,271 --> 04:29:43,671 understands, be it a\nMac, or PC, or a phone 4695 04:29:43,671 --> 04:29:46,911 before you start getting\ninto actual 0s and 1s. 4696 04:29:46,911 --> 04:29:48,273 And most of this is cryptic. 
4697 04:29:48,273 --> 04:29:51,440 I couldn't tell you what this is doing\n 4698 04:29:51,441 --> 04:29:54,561 and rewound mentally, years\nago, from having studied it 4699 04:29:54,560 --> 04:29:57,140 but let's highlight a\nfew key words in yellow. 4700 04:29:57,140 --> 04:30:01,640 Notice that this assembly language\n 4701 04:30:01,640 --> 04:30:04,790 for you automatically,\nstill has mention of main 4702 04:30:04,790 --> 04:30:07,550 and it has mention of get string,\nand it has mention of printf. 4703 04:30:07,550 --> 04:30:10,618 So there's some relationship to\nthe C code we saw a moment ago. 4704 04:30:10,619 --> 04:30:12,411 And then if I highlight\nthese other things 4705 04:30:12,411 --> 04:30:14,691 these are what are called\ncomputer instructions. 4706 04:30:14,691 --> 04:30:17,001 At the end of the day,\nyour Mac, your PC 4707 04:30:17,001 --> 04:30:20,601 your phone actually only\nunderstands very basic instructions 4708 04:30:20,601 --> 04:30:25,281 like addition, subtraction, division,\n 4709 04:30:25,281 --> 04:30:30,451 load from memory, print something to\n 4710 04:30:30,450 --> 04:30:32,015 And that's what you're seeing here. 4711 04:30:32,015 --> 04:30:37,011 These assembly instructions\nare what the computer actually 4712 04:30:37,011 --> 04:30:41,131 feeds into the brains of the computer,\n 4713 04:30:41,130 --> 04:30:44,030 And it's that Intel CPU,\nor whatever you have 4714 04:30:44,031 --> 04:30:47,481 that understands this instruction, and\n 4715 04:30:47,480 --> 04:30:50,120 And collectively, long\nstory short, all they do 4716 04:30:50,120 --> 04:30:52,880 is print hello, world on\nthe screen, but in a way 4717 04:30:52,880 --> 04:30:56,171 that the machine understands how to do. 4718 04:30:58,761 --> 04:31:01,271 Are there any questions on\nwhat we mean by preprocessing? 
4719 04:31:01,271 --> 04:31:05,111 Which finds and replaces the hash\n 4720 04:31:05,111 --> 04:31:08,711 and compiling, which technically\ntakes your source code 4721 04:31:08,710 --> 04:31:12,430 once preprocessed, and converts it to\n 4722 04:31:12,431 --> 04:31:14,603 AUDIENCE: [INAUDIBLE] each CPU has-- 4723 04:31:15,550 --> 04:31:18,970 Each type of CPU has\nits own instruction set. 4724 04:31:19,540 --> 04:31:23,230 And as a teaser, this is why,\nat least back in the day, when 4725 04:31:23,230 --> 04:31:27,161 we used to install software from\n 4726 04:31:27,161 --> 04:31:32,483 this is why you can't take a program\n 4727 04:31:32,483 --> 04:31:33,941 and run it on a Mac, or vice-versa. 4728 04:31:33,941 --> 04:31:38,681 Because the commands, the instructions\n 4729 04:31:39,761 --> 04:31:44,411 Now Microsoft, or any company, could\n 4730 04:31:44,411 --> 04:31:48,370 like C or another, and they can\n 4731 04:31:50,050 --> 04:31:54,369 It's twice as much work and sometimes\n 4732 04:31:54,370 --> 04:31:57,401 but that's why these steps\nare somewhat distinct. 4733 04:31:57,400 --> 04:32:00,970 You can now use the same code and\n 4734 04:32:03,911 --> 04:32:07,060 Thankfully, this part is fairly\n 4735 04:32:07,060 --> 04:32:10,511 To assemble code, which is step\nthree of four, that is just 4736 04:32:10,511 --> 04:32:14,621 happening for you every time\nyou run make or, in turn, clang 4737 04:32:14,620 --> 04:32:17,830 this assembly language, which the\n 4738 04:32:17,831 --> 04:32:21,341 for you from your source code,\nis turned into 0s and 1s. 4739 04:32:21,341 --> 04:32:25,043 So that's the step that, last\nweek, I simplified and said 4740 04:32:25,043 --> 04:32:28,210 when you compile your code, you convert\n 4741 04:32:29,230 --> 04:32:31,945 Technically, that happens\nwhen you assemble your code. 
4742 04:32:31,945 --> 04:32:35,200 But no one in normal\nconversations says that, they just 4743 04:32:35,200 --> 04:32:37,540 say compile for all of these terms. 4744 04:32:43,331 --> 04:32:46,661 Even in this simple program\nof getting the user's name 4745 04:32:46,661 --> 04:32:51,380 and then plugging it into printf, I'm\n 4746 04:32:51,880 --> 04:32:54,460 My own, which is in hello.c. 4747 04:32:54,460 --> 04:32:59,860 Some of CS50s, which is\nin hello.c, sorry-- which 4748 04:32:59,861 --> 04:33:03,341 is in cs50.c, which is not\na file I've mentioned, yet 4749 04:33:03,341 --> 04:33:07,480 but it stands to reason, that if\n 4750 04:33:07,480 --> 04:33:09,640 turns out, the actual\nimplementation of get string 4751 04:33:09,640 --> 04:33:11,861 and other things are in cs50.c. 4752 04:33:11,861 --> 04:33:15,550 And there's a third file\nsomewhere on the hard drive 4753 04:33:15,550 --> 04:33:18,521 that's involved in compiling\neven this simple program. 4754 04:33:18,521 --> 04:33:24,231 hello.c, cs50.c, and by that\nlogic, what might the other be? 4755 04:33:27,861 --> 04:33:30,951 And that's a bit of a white lie,\n 4756 04:33:30,951 --> 04:33:34,011 that there's actually multiple files\n 4757 04:33:34,010 --> 04:33:35,640 and we'll take the simplification. 
4758 04:33:35,640 --> 04:33:40,460 So when I have this code,\nand I compile my code 4759 04:33:40,460 --> 04:33:45,560 I get those 0s and 1s that end up taking\n 4760 04:33:45,561 --> 04:33:51,091 into 0s and 1s that are combined with\n 4761 04:33:52,100 --> 04:33:57,560 Here might be the 0s and 1s for my code,\n 4762 04:33:57,561 --> 04:34:02,181 Here might be the 0s and 1s for what\n 4763 04:34:02,181 --> 04:34:06,471 Here might be the 0s and 1s that someone\n 4764 04:34:06,471 --> 04:34:09,980 The last and final step\nis that linking command 4765 04:34:09,980 --> 04:34:12,591 that links all of these\n0s and 1s together 4766 04:34:12,591 --> 04:34:18,081 essentially stitches them together\n 4767 04:34:18,080 --> 04:34:20,645 or called a.out, whatever you name it. 4768 04:34:20,646 --> 04:34:25,911 That last step is what combines all of\n 4769 04:34:25,911 --> 04:34:28,311 And my God, now we're\nreally in the weeds. 4770 04:34:28,311 --> 04:34:31,281 Who wants to even think about\nrunning code at this level? 4771 04:34:33,440 --> 04:34:36,008 When you're running make,\nthere's some very concrete steps 4772 04:34:36,008 --> 04:34:38,550 that are happening that humans\nhave developed over the years 4773 04:34:38,550 --> 04:34:41,960 over the decades, that breakdown\n 4774 04:34:41,960 --> 04:34:46,670 to 0s and 1s, or machine code,\ninto these very specific steps. 4775 04:34:46,670 --> 04:34:50,360 But henceforth, you can\ncall all of this compiling. 4776 04:34:52,856 --> 04:34:55,064 AUDIENCE: Can you explain\nagain what a.out signifies? 4777 04:34:57,530 --> 04:35:02,150 a.out is just the conventional,\n 4778 04:35:02,151 --> 04:35:05,541 that you compile directly\nwith a compiler, like clang. 4779 04:35:05,541 --> 04:35:07,941 It's a meaningless name, though. 4780 04:35:07,940 --> 04:35:11,510 It stands for assembler output, and\n 4781 04:35:11,510 --> 04:35:12,950 from this assembling process. 
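The separate steps can be observed with clang's own flags, sketched here as comments above a tiny C file. The file name greet.c and the function greet are hypothetical; -E, -S, and -c are standard clang options that stop after preprocessing, compiling, and assembling, respectively.

```c
// greet.c (hypothetical file name, used only for this sketch)
//
// Each stage can be requested from clang separately:
//   clang -E greet.c    preprocess only: prints the source with headers pasted in
//   clang -S greet.c    stop after compiling: writes assembly code to greet.s
//   clang -c greet.c    stop after assembling: writes machine code to greet.o
// Linking greet.o into a runnable program additionally needs a file
// that defines main, which this sketch leaves out.
#include <stdio.h>

// Prints the greeting and returns the number of characters printed.
int greet(void)
{
    return printf("hello, world\n");
}
```

Leaving out main is deliberate here: greet.o on its own is a pile of 0s and 1s that still has to be linked with other 0s and 1s before it becomes a program.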
4782 04:35:12,951 --> 04:35:15,411 It's a lame name for a\ncomputer program, and we 4783 04:35:15,411 --> 04:35:20,710 can override it by outputting\nsomething like hello, instead. 4784 04:35:27,687 --> 04:35:32,121 DAVID MALAN: To recap, there are\n 4785 04:35:32,120 --> 04:35:36,170 cs50.h, stdio.h, technically, they're\n 4786 04:35:36,170 --> 04:35:38,720 even though you, strictly\nspeaking, don't need most of them 4787 04:35:38,721 --> 04:35:42,451 but they are there, just in\ncase you might want them. 4788 04:35:42,451 --> 04:35:43,921 And finally, any other questions? 4789 04:35:48,138 --> 04:35:51,181 DAVID MALAN: Does it matter what order\n 4790 04:35:51,181 --> 04:35:53,401 Sometimes with libraries,\nyes, it matters 4791 04:35:53,401 --> 04:35:55,781 what order they are linked in together. 4792 04:35:55,780 --> 04:35:58,591 But for our purposes, it's\nreally not going to matter. 4793 04:35:58,591 --> 04:36:03,010 It's going to-- make is going to take\n 4794 04:36:03,510 --> 04:36:06,055 So with that said, henceforth,\ncompiling, technically 4795 04:36:06,931 --> 04:36:10,951 But we'll focus on it as a higher\nlevel concept, an abstraction 4796 04:36:14,140 --> 04:36:16,771 So another process that we'll\nnow begin to focus on all the 4797 04:36:16,771 --> 04:36:19,951 more this week because, invariably,\n 4798 04:36:19,951 --> 04:36:21,421 ran up against some challenges. 4799 04:36:21,420 --> 04:36:24,810 You probably created your very first\n 4800 04:36:24,811 --> 04:36:28,201 and so let's focus for a moment on\n 4801 04:36:28,201 --> 04:36:31,320 As you spend more time\nthis semester, in the years 4802 04:36:31,320 --> 04:36:34,530 to come If you continue to program,\n 4803 04:36:34,530 --> 04:36:37,837 going to write bug\nfree code, ultimately. 
4804 04:36:37,837 --> 04:36:40,920 Though your programs are going to get\n 4805 04:36:40,920 --> 04:36:44,490 and we're all going to start to\n 4806 04:36:44,491 --> 04:36:46,831 And to this day, I write\nbuggy code all the time. 4807 04:36:46,830 --> 04:36:48,780 And I'm always horrified\nwhen I do it up here. 4808 04:36:48,780 --> 04:36:50,880 But hopefully, that\nwon't happen too often. 4809 04:36:50,881 --> 04:36:54,361 But when it does, it's a process,\nnow, of debugging, trying 4810 04:36:54,361 --> 04:36:56,491 to find the mistakes in your program. 4811 04:36:56,491 --> 04:36:59,861 You don't have to stare at your code,\n 4812 04:36:59,861 --> 04:37:02,851 There are actual tools\nthat real world programmers 4813 04:37:02,850 --> 04:37:06,120 use to help debug their\ncode and find these faults. 4814 04:37:06,120 --> 04:37:08,716 So what are some of the techniques\nand tools that folks use? 4815 04:37:08,716 --> 04:37:13,701 Well as an aside, if you've ever-- 4816 04:37:13,701 --> 04:37:17,101 a bug in a program is a mistake,\n 4817 04:37:17,100 --> 04:37:22,270 If you've ever heard this tale,\nsome 50 plus years ago, in 1947. 4818 04:37:22,271 --> 04:37:27,030 This is an entry in a log book written\n 4819 04:37:27,030 --> 04:37:29,490 as-- named Grace Hopper,\nwho happened to be the one 4820 04:37:29,491 --> 04:37:33,606 to record the very first discovery of a\n 4821 04:37:33,605 --> 04:37:36,120 This was like a moth\nthat had flown into 4822 04:37:36,120 --> 04:37:41,341 at the time, a very sophisticated system\n 4823 04:37:41,341 --> 04:37:44,311 very large, refrigerator-sized\ntype systems 4824 04:37:44,311 --> 04:37:48,421 in which an actual bug caused an issue. 4825 04:37:48,420 --> 04:37:51,450 The etymology of bug though,\npredates this particular instance 4826 04:37:51,451 --> 04:37:54,841 but here you have, as any computer\n 4827 04:37:54,841 --> 04:37:57,105 of a first physical bug in a computer. 
4828 04:37:57,105 --> 04:37:59,582 How, though, do you go\nabout removing such a thing? 4829 04:37:59,582 --> 04:38:02,040 Well, let's consider a very\nsimple scenario from last time 4830 04:38:02,041 --> 04:38:05,041 for instance, when we were trying to\n 4831 04:38:05,041 --> 04:38:07,230 like this column of 3 bricks. 4832 04:38:07,230 --> 04:38:10,920 Let's consider how I might go about\n 4833 04:38:10,920 --> 04:38:15,390 Let me switch back over to VS\nCode here, and I'm going to run-- 4834 04:38:17,010 --> 04:38:18,900 And I'm not going to\ntrust myself, so I'm 4835 04:38:18,901 --> 04:38:20,768 going to call it\nbuggy.c from the get-go 4836 04:38:20,768 --> 04:38:22,600 knowing that I'm going\nto mess something up. 4837 04:38:22,600 --> 04:38:25,411 But I'm going to go ahead\nand include stdio.h. 4838 04:38:25,411 --> 04:38:28,201 And I'm going to define main, as usual. 4839 04:38:28,201 --> 04:38:30,210 So hopefully, no mistakes just yet. 4840 04:38:30,210 --> 04:38:32,970 And now, I want to print those\n3 bricks on the screen using 4841 04:38:34,530 --> 04:38:40,681 So how about for int i gets 0, i less\n 4842 04:38:40,681 --> 04:38:42,541 Now, inside of my\ncurly braces, I'm going 4843 04:38:42,541 --> 04:38:48,221 to go ahead and print out a hash\n 4844 04:38:48,221 --> 04:38:52,236 All right, saving the file, doing\n 4845 04:38:52,236 --> 04:38:57,601 So there's no syntactical errors,\n 4846 04:38:57,600 --> 04:39:00,900 But some of you have probably\nseen the logical error already 4847 04:39:00,901 --> 04:39:03,631 because when I run this\nprogram I don't get 4848 04:39:03,631 --> 04:39:09,691 this picture, which was 3 bricks\n 4849 04:39:09,690 --> 04:39:12,190 Now, this might be jumping out\nat you, why it's happening 4850 04:39:12,190 --> 04:39:14,190 but I've kept the program\nsimple just so that we 4851 04:39:14,190 --> 04:39:18,271 don't have to find an actual bug, we can\n 4852 04:39:20,230 --> 04:39:23,311 What might be the first strategy\nfor
finding a bug like this 4853 04:39:23,311 --> 04:39:27,552 rather than staring at your code,\n 4854 04:39:28,385 --> 04:39:31,950 Well, let's actually try to diagnose\n 4855 04:39:31,951 --> 04:39:34,681 And the simplest way to do\nthis now, and years from now 4856 04:39:34,681 --> 04:39:38,131 is, honestly, going to be to\nuse a function like printf. 4857 04:39:38,131 --> 04:39:40,050 Printf is a wonderfully\nuseful function, not 4858 04:39:40,050 --> 04:39:42,810 for formatting-- printing\nformatted strings and all that, but for 4859 04:39:42,811 --> 04:39:45,691 just looking inside\nthe values of variables 4860 04:39:45,690 --> 04:39:48,612 that you might be curious\nabout to see what's going on. 4861 04:39:50,580 --> 04:39:53,370 I see that there's 4 coming\nout, but I intended 3. 4862 04:39:53,370 --> 04:39:56,001 So clearly, something's\nwrong with my i variable. 4863 04:39:56,001 --> 04:39:58,350 So let me be a little more pedantic. 4864 04:39:58,350 --> 04:40:01,560 Let me go inside of this\nloop and, temporarily 4865 04:40:01,561 --> 04:40:04,741 say something explicit, like, i is-- 4866 04:40:04,741 --> 04:40:09,461 %i, backslash n, and then just\nplug in the value of i. 4867 04:40:09,960 --> 04:40:13,230 This is not the program I want to\n 4868 04:40:13,230 --> 04:40:18,661 writing, because now I'm going\nto say make buggy, ./buggy. 4869 04:40:18,661 --> 04:40:20,760 And if I look, now,\nat the output, I have 4870 04:40:20,760 --> 04:40:25,350 some helpful diagnostic information.\n 4871 04:40:25,350 --> 04:40:27,870 and I get a hash, 2 and I\nget a hash, 3 and I get a hash. 4872 04:40:28,788 --> 04:40:30,870 I'm clearly going too many\nsteps because, maybe, I 4873 04:40:30,870 --> 04:40:33,510 forgot that computers are,\nessentially, counting from 0 4874 04:40:33,510 --> 04:40:35,710 and now, oh, it's less than or equal to.
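The printf technique just described can be sketched as follows. The loop bodies follow the description of buggy.c, but the function names and the returned count of bricks are additions made here for illustration, so that the two versions are easy to compare.

```c
#include <stdio.h>

// Buggy version: <= lets the loop run for i = 0, 1, 2, and 3.
// Returns how many bricks were printed (a counter added for this sketch).
int print_column_buggy(void)
{
    int printed = 0;
    for (int i = 0; i <= 3; i++)
    {
        printf("i is %i\n", i); // temporary diagnostic output
        printf("#\n");
        printed++;
    }
    return printed;
}

// One possible fix: < stops the loop after i = 0, 1, and 2.
int print_column(void)
{
    int printed = 0;
    for (int i = 0; i < 3; i++)
    {
        printf("#\n");
        printed++;
    }
    return printed;
}
```

The diagnostic line is meant to be deleted once the mistake is understood; it exists only to show the value of i on each pass through the loop.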
4875 04:40:37,291 --> 04:40:40,201 Again, trivial example,\nbut just by using printf 4876 04:40:40,201 --> 04:40:43,171 you can see inside of\nthe computer's memory 4877 04:40:43,170 --> 04:40:45,390 by just printing stuff out like this. 4878 04:40:45,390 --> 04:40:50,030 And now, once you've figured it out, oh,\n 4879 04:40:50,030 --> 04:40:52,400 or I should start\ncounting from 1, there's 4880 04:40:52,401 --> 04:40:53,901 any number of ways I could fix this. 4881 04:40:53,901 --> 04:40:56,916 But the most conventional is\nprobably just to say less than 3. 4882 04:40:56,916 --> 04:41:03,441 Now, I can delete my temporary print\n 4883 04:41:06,050 --> 04:41:08,091 All right, and to this day, I do this. 4884 04:41:08,091 --> 04:41:11,120 Whether it's making a command line\n 4885 04:41:11,120 --> 04:41:13,310 or mobile application,\nIt's very common to use 4886 04:41:13,311 --> 04:41:15,531 printf, or some equivalent\nin any language 4887 04:41:15,530 --> 04:41:19,611 just to poke around and see what's\ninside the computer's memory. 4888 04:41:19,611 --> 04:41:22,831 Thankfully, there's more\nsophisticated tools than this. 4889 04:41:22,830 --> 04:41:25,190 Let me go ahead and\nreintroduce the bug here. 4890 04:41:25,190 --> 04:41:28,880 And let me reopen my\nsidebar at left here. 4891 04:41:28,881 --> 04:41:32,811 Let me now recompile the code\nto make sure it's current. 4892 04:41:32,811 --> 04:41:35,570 And I'm going to run a\ncommand called debug50. 4893 04:41:35,570 --> 04:41:39,350 Which is a command that's\nrepresentative of a type of program 4894 04:41:41,001 --> 04:41:43,940 And this debugger is\nactually built into VS Code. 4895 04:41:43,940 --> 04:41:47,960 And all debug50 is doing for us is\n 4896 04:41:47,960 --> 04:41:49,911 VS Code's built-in debugger. 
4897 04:41:49,911 --> 04:41:52,521 So this isn't even a\nCS50-specific tool, we've 4898 04:41:52,521 --> 04:41:55,431 just given you a debug50\ncommand to make it easier 4899 04:41:55,431 --> 04:41:57,116 to start it up from the get-go. 4900 04:41:57,116 --> 04:42:01,821 And the way you run this debugger\n 4901 04:42:01,820 --> 04:42:04,381 the name of the program\nthat you want to debug. 4902 04:42:06,471 --> 04:42:08,271 So you don't mention your c-file. 4903 04:42:08,271 --> 04:42:10,911 You mention your already-compiled code. 4904 04:42:10,911 --> 04:42:16,491 And what this debugger is going\n 4905 04:42:16,491 --> 04:42:19,191 walk through my code step-by-step. 4906 04:42:19,190 --> 04:42:23,190 Because every program we've written\n 4907 04:42:23,190 --> 04:42:26,585 even if I'm not done thinking\nthrough each step at a time. 4908 04:42:26,585 --> 04:42:30,111 With a debugger, I can\nactually click on a line number 4909 04:42:30,111 --> 04:42:33,441 and say pause execution\nhere, and the debugger 4910 04:42:33,440 --> 04:42:38,390 will let me walk through my code one\n 4911 04:42:38,390 --> 04:42:41,001 one minute at a time,\nat my own human pace. 4912 04:42:41,001 --> 04:42:43,730 Which is super compelling when\nthe programs get more complicated 4913 04:42:43,730 --> 04:42:46,861 and they might, otherwise,\nfly by on the screen. 4914 04:42:46,861 --> 04:42:50,120 So I'm going to click\nto the left of line 5. 4915 04:42:50,120 --> 04:42:52,230 And notice that these\nlittle red dots appear. 4916 04:42:52,230 --> 04:42:55,550 And if I click on one it\nstays, and gets even redder. 4917 04:42:55,550 --> 04:42:58,490 And I'm going to run debug50 on ./buggy. 4918 04:42:58,491 --> 04:43:03,351 And in just a moment, you'll see that a\n 4919 04:43:03,350 --> 04:43:06,170 It's doing some\nconfiguration of the screen. 
4920 04:43:06,170 --> 04:43:10,950 Let me zoom out a little bit here so\n 4921 04:43:10,951 --> 04:43:14,701 And sometimes, you'll see in VS\n 4922 04:43:14,701 --> 04:43:18,741 which looks very cryptic, just go back\n 4923 04:43:18,741 --> 04:43:22,136 Because at the terminal window is where\n 4924 04:43:22,135 --> 04:43:24,380 And let's now take a\nlook at what's going on. 4925 04:43:24,381 --> 04:43:28,911 If I zoom in on my\nbuggy.c code here, you'll 4926 04:43:28,911 --> 04:43:35,151 notice that we have the same program\n 4927 04:43:36,080 --> 04:43:39,920 Not a coincidence, that's the line\n 4928 04:43:39,920 --> 04:43:44,661 The little red dot means break\nhere, pause execution here. 4929 04:43:44,661 --> 04:43:47,976 And the yellow line has\nnot yet been executed. 4930 04:43:47,976 --> 04:43:51,860 But if I, now, at the top of my\n 4931 04:43:53,010 --> 04:43:55,010 There's one for this,\nwhich, if I hover over it 4932 04:43:55,010 --> 04:43:58,400 says Step Over, there's another\nthat's going to say Step Into 4933 04:43:58,401 --> 04:44:00,081 there's a third that says Step Out. 4934 04:44:00,080 --> 04:44:02,780 I'm just going to use the\nfirst of these, Step Over. 4935 04:44:02,780 --> 04:44:05,841 And I'm going to do this, and\n 4936 04:44:05,841 --> 04:44:09,921 moved from line 5 to line\n7 because now it's ready 4937 04:44:09,920 --> 04:44:12,215 but hasn't yet printed out that hash. 4938 04:44:12,216 --> 04:44:16,078 But the most powerful thing here,\nnotice, is that top left here. 4939 04:44:16,078 --> 04:44:18,411 It's a little cryptic, because\nthere's a bunch of things 4940 04:44:18,411 --> 04:44:21,170 going on that will make more\nsense over time, but at the top 4941 04:44:21,170 --> 04:44:22,730 there's a section called variables. 4942 04:44:22,730 --> 04:44:25,010 Below that, something\ncalled locals, which means 4943 04:44:25,010 --> 04:44:27,080 local to my current function, main. 
4944 04:44:27,080 --> 04:44:31,670 And notice, there's my variable\n 4945 04:44:31,670 --> 04:44:37,070 So now, once I click Step Over\nagain, watch what happens. 4946 04:44:37,070 --> 04:44:39,920 We go from line 7 back to line 5. 4947 04:44:39,920 --> 04:44:43,715 But look in the terminal window,\none of the hashes has printed. 4948 04:44:43,716 --> 04:44:46,311 But now, it's printed at my own pace. 4949 04:44:46,311 --> 04:44:48,291 I can think through this step-by-step. 4950 04:44:48,291 --> 04:44:50,601 Notice that i has not changed, yet. 4951 04:44:50,600 --> 04:44:53,960 It's still 0 because the yellow\n 4952 04:44:53,960 --> 04:44:58,400 But the moment I click Step Over,\nit's going to execute line 5. 4953 04:44:58,401 --> 04:45:05,271 Now, notice at top left, i has become\n 4954 04:45:05,271 --> 04:45:07,550 because now, highlighted is line 7. 4955 04:45:07,550 --> 04:45:12,260 So if I click Step Over\nagain, we'll see the hash. 4956 04:45:12,260 --> 04:45:16,190 If I repeat this process at my\nown human, comfortable pace 4957 04:45:16,190 --> 04:45:21,300 I can see my variables changing, I\n 4958 04:45:21,300 --> 04:45:24,163 and I can just think about\nshould that have just happened. 4959 04:45:24,163 --> 04:45:26,120 I can pause and give\nthought to what's actually 4960 04:45:26,120 --> 04:45:30,501 going on without trying to race the\n 4961 04:45:30,501 --> 04:45:32,751 I'm going to go ahead and\nstop here because we already 4962 04:45:32,751 --> 04:45:35,690 know what this particular problem\nis, and that brings me back 4963 04:45:35,690 --> 04:45:36,980 to my default terminal window. 
4964 04:45:36,980 --> 04:45:40,440 But this debugger, let me\ndisable the breakpoint now 4965 04:45:40,440 --> 04:45:42,830 so it doesn't keep\nbreaking, this debugger 4966 04:45:42,830 --> 04:45:45,020 will be your friend\nmoving forward in order 4967 04:45:45,021 --> 04:45:49,550 to step through your code step-by-step,\n 4968 04:45:49,550 --> 04:45:51,080 where something has gone wrong. 4969 04:45:51,080 --> 04:45:54,657 Printf is great, but it gets annoying if\n 4970 04:45:54,657 --> 04:45:57,740 print this, print this, print this,\n 4971 04:45:59,241 --> 04:46:04,041 The debugger lets you do the\nequivalent, but automatically. 4972 04:46:04,041 --> 04:46:10,221 Questions on this debugger, which you'll\n 4973 04:46:12,814 --> 04:46:14,820 AUDIENCE: You were using\na Step Over feature. 4974 04:46:14,820 --> 04:46:17,563 What do the other\nfeatures in the debugger-- 4975 04:46:17,563 --> 04:46:18,980 DAVID MALAN: Really good question. 4976 04:46:18,980 --> 04:46:21,980 We'll see this before long, but those\n 4977 04:46:21,980 --> 04:46:26,721 step into and step out of, actually\n 4978 04:46:26,721 --> 04:46:28,461 if I had any more than main. 4979 04:46:28,460 --> 04:46:31,220 So if main called a\nfunction called something 4980 04:46:31,221 --> 04:46:34,640 and something called a function\n 4981 04:46:34,640 --> 04:46:38,990 stepping over the entire execution of\n 4982 04:46:38,991 --> 04:46:41,366 and walk through its\nlines of code one by one. 4983 04:46:41,366 --> 04:46:43,281 So any time you have\na problem set you're 4984 04:46:43,280 --> 04:46:46,400 working on that has multiple functions,\n 4985 04:46:46,401 --> 04:46:50,511 if you want, or you can set it inside\n 4986 04:46:50,510 --> 04:46:53,390 to focus your attention only on that. 4987 04:46:53,390 --> 04:46:56,900 And we'll see examples\nof that over time. 
4988 04:46:58,041 --> 04:47:02,361 And what's, sort of, the elephant\nin the room, so to speak 4989 04:47:02,361 --> 04:47:04,011 is actually a duck in this case. 4990 04:47:04,010 --> 04:47:06,420 Why is there this duck and\nall of these ducks here? 4991 04:47:06,420 --> 04:47:10,700 Well, it turns out, a third, genuinely\n 4992 04:47:10,701 --> 04:47:14,316 is talking through problems, talking\n 4993 04:47:14,315 --> 04:47:16,880 Now, in the absence of having\na family member, or a friend 4994 04:47:16,881 --> 04:47:20,781 or a roommate who actually wants to\n 4995 04:47:20,780 --> 04:47:25,580 generally, programmers turn to a\n 4996 04:47:25,580 --> 04:47:27,620 if something animate is not available. 4997 04:47:27,620 --> 04:47:31,021 The idea behind rubber duck\ndebugging, so to speak 4998 04:47:31,021 --> 04:47:37,011 is that simply by looking at your code\n 4999 04:47:37,010 --> 04:47:41,300 I'm starting a for loop and\nI'm initializing i to 0. 5000 04:47:41,300 --> 04:47:43,251 OK, then, I'm printing out a hash. 5001 04:47:43,251 --> 04:47:48,372 Just by talking through your\ncode, step-by-step, invariably 5002 04:47:48,372 --> 04:47:51,080 finds you having the proverbial\n 5003 04:47:51,080 --> 04:47:53,300 because you realize, wait a minute,\nI just said something stupid 5004 04:47:53,300 --> 04:47:54,771 or I just said something wrong. 5005 04:47:54,771 --> 04:47:58,761 And this is really just a proxy for any\n 5006 04:48:00,320 --> 04:48:02,701 But in the absence of any\nof those people in the room 5007 04:48:02,701 --> 04:48:04,618 you're welcome to take,\non your way out today, 5008 04:48:04,617 --> 04:48:08,540 one of these little rubber ducks and\n 5009 04:48:08,541 --> 04:48:12,081 you want to talk through one\nof your problems in CS50 5010 04:48:12,080 --> 04:48:13,400 or maybe life more generally.
5011 04:48:13,401 --> 04:48:15,741 But having it there on\nyour desk is just a way 5012 04:48:15,741 --> 04:48:19,401 to help you hear illogic\nin what you think 5013 04:48:19,401 --> 04:48:22,051 might, otherwise, be logical code. 5014 04:48:22,050 --> 04:48:26,661 So printf, debugging, rubber-duck\n 5015 04:48:26,661 --> 04:48:29,468 you'll see over time, to\nget to the source of code 5016 04:48:29,468 --> 04:48:31,050 that you will write that has mistakes. 5017 04:48:31,050 --> 04:48:33,140 Which is going to happen,\nbut it will empower you 5018 04:48:33,140 --> 04:48:36,260 all the more to solve those mistakes. 5019 04:48:36,260 --> 04:48:41,700 All right, any questions on debugging,\n 5020 04:48:44,001 --> 04:48:46,911 DAVID MALAN: What's the difference\n 5021 04:48:46,911 --> 04:48:50,241 At the moment, the only one that's\n 5022 04:48:50,241 --> 04:48:53,601 is Step Over, because it means\nstep over each line of code. 5023 04:48:53,600 --> 04:48:58,310 If, though, I had other functions\n 5024 04:48:58,311 --> 04:49:03,561 maybe lower down in the file, I\n 5025 04:49:03,561 --> 04:49:05,730 and walk through them one at a time. 5026 04:49:05,729 --> 04:49:07,911 So we'll come back to this\nwith an actual example 5027 04:49:07,911 --> 04:49:10,491 but step into will allow\nme to do exactly that. 5028 04:49:10,491 --> 04:49:13,471 In fact, this is a perfect segue to\n 5029 04:49:13,471 --> 04:49:15,893 Let me go ahead and open\nup another file here. 5030 04:49:15,893 --> 04:49:17,600 And, actually, we'll\nuse the same, buggy. 5031 04:49:17,600 --> 04:49:20,580 And we're going to write one\nother thing that's buggy, as well. 5032 04:49:20,580 --> 04:49:24,260 Let me go up here and\ninclude, as before, cs50.h. 5033 04:49:29,780 --> 04:49:32,310 So all of this, I think,\nis correct, so far. 5034 04:49:32,311 --> 04:49:35,541 And let's do this, let's\ngive myself an int called i 5035 04:49:35,541 --> 04:49:38,791 and let's ask the user\nfor a negative integer. 
5036 04:49:38,791 --> 04:49:41,561 This is not a function that\nexists, technically, yet. 5037 04:49:41,561 --> 04:49:44,311 But I'm going to assume, for the\n 5038 04:49:44,311 --> 04:49:47,961 Then, I'm just going to print\nout, with %i and a new line 5039 04:49:47,960 --> 04:49:49,620 whatever the human typed in. 5040 04:49:49,620 --> 04:49:52,580 So at this point in the story,\nmy program, I think, is correct. 5041 04:49:52,580 --> 04:49:55,190 Except for the fact that\nget negative int is not 5042 04:49:55,190 --> 04:49:57,950 a function in the CS50\nlibrary or anywhere else. 5043 04:49:57,951 --> 04:49:59,721 I'm going to need to invent it myself. 5044 04:49:59,721 --> 04:50:05,570 So suppose, in this case, that I declare\n 5045 04:50:05,570 --> 04:50:09,890 Its return type, so to speak, should\n 5046 04:50:09,890 --> 04:50:12,620 I want to hand the user back\nan integer, and it's going 5047 04:50:12,620 --> 04:50:14,570 to take no input to keep it simple. 5048 04:50:14,570 --> 04:50:16,070 So I'm just going to say void there. 5049 04:50:16,070 --> 04:50:19,070 No inputs, no special\nprompts, nothing like that. 5050 04:50:19,070 --> 04:50:21,861 Let me, now, give myself\nsome curly braces. 5051 04:50:21,861 --> 04:50:24,771 And let me do something familiar,\nperhaps, from problem set 1. 5052 04:50:24,771 --> 04:50:29,811 Let me give myself a variable,\n 5053 04:50:31,580 --> 04:50:37,850 Assign n the value of get int, asking\n 5054 04:50:39,111 --> 04:50:43,011 And I want to do this while\nn is less than 0, because I 5055 04:50:43,010 --> 04:50:44,650 want to get a negative from the user.
5056 04:50:44,651 --> 04:50:48,401 And recall, from having\nused this block in the past 5057 04:50:48,401 --> 04:50:52,031 I can now return n as the\nvery last step to hand back 5058 04:50:52,030 --> 04:50:56,050 whatever the user has typed in, so\n 5059 04:50:58,010 --> 04:51:00,970 Now, I've deliberately\nmade a mistake here 5060 04:51:00,971 --> 04:51:03,341 and it's a subtle,\nsilly, mathematical one 5061 04:51:03,341 --> 04:51:08,171 but let me compile this program after\n 5062 04:51:08,170 --> 04:51:09,640 so I don't make that mistake again. 5063 04:51:09,640 --> 04:51:12,730 Let me do make buggy, Enter. 5064 04:51:14,980 --> 04:51:18,280 I'll give it a negative\ninteger, like negative 50. 5065 04:51:29,260 --> 04:51:33,341 So it's, clearly, working backwards,\n 5066 04:51:33,341 --> 04:51:35,061 So how could I go about debugging this? 5067 04:51:35,061 --> 04:51:36,686 Well, I could do what I've done before? 5068 04:51:36,686 --> 04:51:43,181 I could use my printf technique and\n 5069 04:51:43,181 --> 04:51:49,570 new line, comma n, just to print\nit out, let me recompile buggy 5070 04:51:49,570 --> 04:51:52,901 let me rerun buggy, let\nme type in negative 50. 5071 04:51:54,890 --> 04:51:57,434 So that didn't really\nhelp me at this point 5072 04:51:57,434 --> 04:51:58,851 because that's the same as before. 5073 04:51:58,850 --> 04:52:02,290 So let me do this, debug50, ./buggy. 5074 04:52:02,291 --> 04:52:04,131 Oh, but I've made a mistake. 5075 04:52:04,131 --> 04:52:05,961 So I didn't set my breakpoint, yet. 5076 04:52:05,960 --> 04:52:09,190 So let me do this, and I'll\nset a breakpoint this time. 5077 04:52:09,190 --> 04:52:11,591 I could set it here, on line 8. 5078 04:52:11,591 --> 04:52:13,600 Let's do it in main, as before. 5079 04:52:17,230 --> 04:52:19,451 That fancy user interface\nis going to pop up. 5080 04:52:19,451 --> 04:52:22,570 It's going to highlight the line\nthat I set the breakpoint on. 
5081 04:52:22,570 --> 04:52:25,510 Notice that, on the left\nhand side of the screen 5082 04:52:25,510 --> 04:52:28,911 i is defaulting, at the moment to 0,\n 5083 04:52:29,411 --> 04:52:35,076 But let me, now, Step Over this\n 5084 04:52:35,076 --> 04:52:36,701 and you'll see that I'm being prompted. 5085 04:52:36,701 --> 04:52:40,480 So let's type in my negative 50, Enter. 5086 04:52:40,480 --> 04:52:45,730 Notice now that I'm\nstuck in that function. 5087 04:52:46,510 --> 04:52:50,780 So clearly, the issue seems to be\n 5088 04:52:50,780 --> 04:52:54,380 So, OK, let me stop this execution. 5089 04:52:54,381 --> 04:52:57,436 My problem doesn't seem to be in\n 5090 04:52:58,061 --> 04:53:00,251 Let me set my same breakpoint at line 8. 5091 04:53:00,251 --> 04:53:02,771 Let me rerun debug50 one more time. 5092 04:53:02,771 --> 04:53:07,370 But this time, instead of just stepping\n 5093 04:53:07,370 --> 04:53:09,670 So notice line 8 is, again,\nhighlighted in yellow. 5094 04:53:09,670 --> 04:53:11,950 In the past I've been\nclicking Step Over. 5095 04:53:14,440 --> 04:53:17,740 When I click Step Into,\nboom, now, the debugger 5096 04:53:17,741 --> 04:53:20,651 jumps into that specific function. 5097 04:53:20,651 --> 04:53:23,591 Now, I can step through these\nlines of code, again and again. 5098 04:53:23,591 --> 04:53:25,960 I can see what the value of\nn is as I'm typing it in. 5099 04:53:25,960 --> 04:53:27,760 I can think through my logic, and voila. 5100 04:53:27,760 --> 04:53:31,900 Hopefully, once I've solved the issue,\n 5101 04:53:33,440 --> 04:53:36,310 So Step Over just goes over\nthe line, but executes it 5102 04:53:36,311 --> 04:53:41,471 Step Into lets you go into\nother functions you've written. 5103 04:53:41,471 --> 04:53:43,661 So let's go ahead and do this. 
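One way to see the mistake in the loop condition, sketched without the CS50 library: the two hypothetical functions below replay a list of simulated inputs instead of prompting a person, and they differ only in the do-while condition. The array-driven design and the function names are assumptions made for this example, and both functions assume at least one input is supplied.

```c
// Buggy condition from the lecture: keep asking while n is less than 0,
// so a negative answer such as -50 is rejected and the loop goes around again.
// Returns the value at which the loop stops (or the last input if they run out).
int accepted_buggy(const int inputs[], int count)
{
    int n = 0;
    int i = 0;
    do
    {
        n = inputs[i]; // stands in for n = get_int(...)
        i++;
    }
    while (n < 0 && i < count);
    return n;
}

// Corrected condition: keep asking while n is NOT negative (n >= 0).
int accepted_fixed(const int inputs[], int count)
{
    int n = 0;
    int i = 0;
    do
    {
        n = inputs[i]; // stands in for n = get_int(...)
        i++;
    }
    while (n >= 0 && i < count);
    return n;
}
```

With the simulated inputs -50, -3, 7 the buggy version only stops at 7, which matches being stuck in the function after typing negative 50. With 7, 0, -50 the corrected version stops at -50.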
5104 04:53:43,661 --> 04:53:47,811 We've got a bunch of\npossible approaches that we 5105 04:53:47,811 --> 04:53:49,811 can take to solving some\nproblems. Let's go ahead 5106 04:53:49,811 --> 04:53:50,991 and pace ourselves today, though. 5107 04:53:50,991 --> 04:53:52,161 Let's take a five-minute break, here. 5108 04:53:52,161 --> 04:53:54,949 And when we come back, we'll take\n 5109 04:54:05,260 --> 04:54:11,120 Up until now, both, by way of week 1\n 5110 04:54:11,120 --> 04:54:14,920 we've just translated from Scratch into\n 5111 04:54:14,920 --> 04:54:17,960 like loops and conditionals,\nBoolean expressions, variables. 5112 04:54:17,960 --> 04:54:19,210 So sort of, more of the same. 5113 04:54:19,210 --> 04:54:22,690 But there are features in C that\n 5114 04:54:22,690 --> 04:54:26,560 like data types, the types of variables\n 5115 04:54:26,561 --> 04:54:28,711 but that, in fact, does\nexist in other languages. 5116 04:54:28,710 --> 04:54:30,460 In fact, a few that\nwe'll see before long. 5117 04:54:30,460 --> 04:54:34,931 So to summarize the types we saw last\n 5118 04:54:34,931 --> 04:54:39,311 We had ints, and floats, and\nlongs, and doubles, and chars 5119 04:54:39,311 --> 04:54:42,771 there's also bools and also string,\n 5120 04:54:42,771 --> 04:54:46,091 But today, let's actually start to\n 5121 04:54:46,091 --> 04:54:50,021 and actually what your Mac and PC\n 5122 04:54:50,021 --> 04:54:53,431 as an int versus a char, versus\na string, versus something else. 5123 04:54:53,431 --> 04:54:56,181 And see if we can't put more tools\n 5124 04:54:56,181 --> 04:54:59,890 so we can start quickly writing\n 5125 04:55:01,061 --> 04:55:04,901 So it turns out, that on\nmost systems nowadays 5126 04:55:04,901 --> 04:55:07,271 though this can vary by\nactual computer, this 5127 04:55:07,271 --> 04:55:10,300 is how large each of the\ndata types, typically 5128 04:55:10,300 --> 04:55:15,850 is in C. 
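Because these sizes can vary by system, a small sketch like the following can report them for whatever machine it runs on. The function names are made up for this example, and the output is deliberately not shown since it depends on the platform (long, in particular, is not 8 bytes everywhere).

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

// Convert a size in bytes to bits (CHAR_BIT is 8 on typical systems).
size_t bits_in(size_t bytes)
{
    return bytes * CHAR_BIT;
}

// Print the size of each basic type on the machine running the program.
void print_type_sizes(void)
{
    printf("bool:   %zu byte(s)\n", sizeof(bool));
    printf("char:   %zu byte(s)\n", sizeof(char));
    printf("int:    %zu byte(s)\n", sizeof(int));
    printf("float:  %zu byte(s)\n", sizeof(float));
    printf("long:   %zu byte(s)\n", sizeof(long));
    printf("double: %zu byte(s)\n", sizeof(double));
}
```

sizeof(char) is 1 by definition in C; every other size is whatever the compiler and hardware agree on.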
When you store a Boolean value,\n 5129 04:55:17,111 --> 04:55:19,361 That's a little excessive,\nbecause, strictly speaking 5130 04:55:19,361 --> 04:55:22,841 you only need 1 bit,\nwhich is 1/8 of this size. 5131 04:55:22,841 --> 04:55:25,451 But for simplicity,\ncomputers use a whole byte 5132 04:55:25,451 --> 04:55:28,001 to represent a bool, true or false. 5133 04:55:28,001 --> 04:55:32,300 A char, we saw last week,\nis only 1 byte, or 8 bits. 5134 04:55:32,300 --> 04:55:37,210 And this is why ASCII, which uses 1\n 5135 04:55:37,210 --> 04:55:41,861 on, was confined to only 256\nmaximally possible characters. 5136 04:55:41,861 --> 04:55:46,201 Notice that an int is\n4 bytes, or 32 bits. 5137 04:55:46,201 --> 04:55:48,841 A float is also 4 bytes or 32 bits. 5138 04:55:48,841 --> 04:55:52,111 But the things that we call long,\n 5139 04:55:54,690 --> 04:55:58,161 A double is 64 bits of precision\nfor floating point values. 5140 04:55:58,161 --> 04:56:01,475 And a string, for today, we're\n 5141 04:56:01,475 --> 04:56:03,600 We'll come back to that,\nlater today and next week 5142 04:56:03,600 --> 04:56:06,780 as to how much space a string\ntakes up, but, suffice it to say 5143 04:56:06,780 --> 04:56:09,748 it's going to take up a\nvariable amount of space 5144 04:56:09,748 --> 04:56:11,791 depending on whether the\nstring is short or long. 5145 04:56:11,791 --> 04:56:14,730 But we'll see exactly what\nthat means, before long. 5146 04:56:14,730 --> 04:56:19,291 So here's a photograph of\na typical piece of memory 5147 04:56:19,291 --> 04:56:22,021 inside of your Mac, or PC, or phone. 5148 04:56:22,021 --> 04:56:24,421 Odds are, it might be a little\nsmaller in some devices. 5149 04:56:24,420 --> 04:56:27,210 This is known as RAM,\nor random access memory. 
5150 04:56:27,210 --> 04:56:29,670 Each of these little black\nchips on this circuit 5151 04:56:29,670 --> 04:56:31,980 board, the green thing,\nthese little black chips 5152 04:56:31,980 --> 04:56:34,890 are where 0s and 1s are actually stored. 5153 04:56:34,890 --> 04:56:36,931 Each of those stores\nsome number of bytes. 5154 04:56:36,931 --> 04:56:39,390 Maybe megabytes, maybe\neven gigabytes, nowadays. 5155 04:56:39,390 --> 04:56:45,690 So let's focus on one of those chips,\n 5156 04:56:45,690 --> 04:56:49,650 Let's consider the fact that, even\n 5157 04:56:49,651 --> 04:56:53,731 how this kind of thing is made, if\n 5158 04:56:53,730 --> 04:56:56,190 for the sake of discussion,\nit stands to reason that 5159 04:56:56,190 --> 04:57:00,091 if this thing is storing 1\nbillion bytes, 1 gigabyte 5160 04:57:00,091 --> 04:57:02,370 then we can number them, arbitrarily. 5161 04:57:02,370 --> 04:57:05,850 Maybe this will be byte\n0, 1, 2, 3, 4, 5, 6, 7, 8. 5162 04:57:05,850 --> 04:57:09,260 Then, maybe, way down here in the bottom\n 5163 04:57:09,260 --> 04:57:13,020 We can just number these things,\nas might be our convention. 5164 04:57:13,021 --> 04:57:14,971 Let's draw that graphically. 5165 04:57:14,971 --> 04:57:17,351 Not with a billion squares,\nbut fewer than those. 5166 04:57:17,350 --> 04:57:19,670 And let's zoom in further,\nand consider that. 5167 04:57:19,670 --> 04:57:21,420 At this point in the\nstory, let's abstract 5168 04:57:21,420 --> 04:57:23,640 away all the hardware,\nand all the little wires 5169 04:57:23,640 --> 04:57:27,990 and just think of memory as taking\n 5170 04:57:27,991 --> 04:57:30,431 as taking up some number of bytes. 5171 04:57:30,431 --> 04:57:34,081 So, for instance, if you were to store\n 5172 04:57:34,080 --> 04:57:38,490 was 1 byte, it might be stored\nat this top left-hand location 5173 04:57:38,491 --> 04:57:40,456 of this black chip of memory. 
5174 04:57:40,455 --> 04:57:44,550 If you were to store something like\n 5175 04:57:44,550 --> 04:57:47,820 it might use four of those bytes,\n 5176 04:57:47,820 --> 04:57:49,480 back-to-back-to-back, in this case. 5177 04:57:49,480 --> 04:57:53,530 If you were to store a long or a double,\n 5178 04:57:53,530 --> 04:57:55,650 So I'm filling in these\nsquares to represent 5179 04:57:55,651 --> 04:58:00,421 how much memory a given variable\n 5180 04:58:00,420 --> 04:58:03,490 1, or 4, or 8, in this case, here. 5181 04:58:03,491 --> 04:58:06,421 Well, from here, let's abstract\naway from all of the hardware 5182 04:58:06,420 --> 04:58:08,580 and really focus on\nmemory as being a grid. 5183 04:58:08,580 --> 04:58:11,911 Or, really, like a canvas that\nwe can paint any types of data 5184 04:58:13,111 --> 04:58:16,861 At the end of the day, all of this\n 5185 04:58:16,861 --> 04:58:20,761 But it's up to you and I to build\nabstractions on top of that. 5186 04:58:20,760 --> 04:58:24,390 Things like actual numbers,\ncolors, images, movies, and beyond. 5187 04:58:24,390 --> 04:58:26,701 But we'll start\nlower-level, here, first. 5188 04:58:26,701 --> 04:58:30,210 Suppose I had a program\nthat needs three integers. 5189 04:58:30,210 --> 04:58:33,060 A simple program whose purpose\nin life is to average your three 5190 04:58:33,061 --> 04:58:36,661 scores on an exam, or some such thing. 5191 04:58:36,661 --> 04:58:41,280 Suppose that your three scores were\n 5192 04:58:42,405 --> 04:58:47,290 Let's write a program that does\nthis kind of averaging for us. 5193 04:58:47,291 --> 04:58:49,121 Let me go back to VS Code, here. 5194 04:58:49,120 --> 04:58:52,530 Let me open up a file called scores.c. 5195 04:58:52,530 --> 04:58:55,091 Let me implement this as follows. 5196 04:58:55,091 --> 04:59:00,120 Let me include stdio.h at the\ntop, int main(void) as before. 5197 04:59:00,120 --> 04:59:05,580 Then, inside of main, let me\ndeclare score 1, which is 72. 
5198 04:59:08,251 --> 04:59:11,401 Then, a third score, called\nscore 3, which is going to be 33. 5199 04:59:11,401 --> 04:59:15,001 Now, I'm going to use printf to print\n 5200 04:59:15,001 --> 04:59:16,780 and I can do this in\na few different ways. 5201 04:59:16,780 --> 04:59:22,111 But I'm going to print out %f, and\n 5202 04:59:22,111 --> 04:59:28,021 plus score 3, divided by 3,\nclose parentheses semicolon. 5203 04:59:28,021 --> 04:59:31,561 Some relatively simple arithmetic to\n 5204 04:59:31,561 --> 04:59:34,831 if I'm curious what my average grade\n 5205 04:59:35,881 --> 04:59:39,877 Let me, now, do make scores. 5206 04:59:39,877 --> 04:59:43,501 All right, so I've somehow\nmade an error already. 5207 04:59:43,501 --> 04:59:49,411 But this one is, actually, germane\nto a problem we, hopefully 5208 04:59:49,411 --> 04:59:51,120 won't encounter too frequently. 5209 04:59:52,120 --> 04:59:55,620 So underlined is score 1, plus\n 5210 04:59:55,620 --> 05:00:00,510 Format specifies type double, but\n 5211 05:00:02,791 --> 05:00:04,691 Because the arithmetic\nseems to check out. 5212 05:00:05,190 --> 05:00:08,820 AUDIENCE: So the computer is doing the\n 5213 05:00:08,820 --> 05:00:13,521 just gives out a value at the\nend because, well [INAUDIBLE] 5214 05:00:14,471 --> 05:00:15,901 And we'll come back to\nthis in more detail 5215 05:00:15,901 --> 05:00:18,783 but, indeed, what's happening here\n 5216 05:00:18,782 --> 05:00:20,740 obviously, because I\ndefine them right up here. 5217 05:00:20,741 --> 05:00:23,731 And I'm dividing by another\nint, 3, but the catch 5218 05:00:23,730 --> 05:00:28,150 is, recall that C when it performs math,\n 5219 05:00:28,151 --> 05:00:30,070 But integers are not\nfloating point values. 
5220 05:00:30,070 --> 05:00:33,151 So if you actually want to get a\nprecise, average for your score 5221 05:00:33,151 --> 05:00:37,021 without throwing away the remainder,\n 5222 05:00:37,021 --> 05:00:39,690 it turns out, we're going to have to-- 5223 05:00:42,690 --> 05:00:46,980 [LAUGHTER] we're going to have to\n 5224 05:00:47,611 --> 05:00:50,491 And there's a few ways to\ndo this but the easiest way 5225 05:00:50,491 --> 05:00:52,801 for now, I'm going to go\nahead and do this up here 5226 05:00:52,800 --> 05:00:55,620 I'm going to change the\ndivide by 3 to divide by 3.0. 5227 05:00:55,620 --> 05:00:59,701 Because it turns out, long story short,\n 5228 05:00:59,701 --> 05:01:01,561 participating in an\narithmetic expression 5229 05:01:01,561 --> 05:01:03,991 like this is something\nlike a float, the rest 5230 05:01:03,991 --> 05:01:08,471 will be treated as promoted to\na floating point value as well. 5231 05:01:08,471 --> 05:01:13,756 So let me, now, recompile this\ncode with make scores, Enter. 5232 05:01:13,756 --> 05:01:17,761 This time it worked OK, because\nI'm treating a float as a float. 5233 05:01:19,861 --> 05:01:24,411 All right, my average is\n59.33333 and so forth. 5234 05:01:24,911 --> 05:01:27,600 So the math, presumably, checks out. 5235 05:01:27,600 --> 05:01:30,480 Floating point imprecision\nper last week aside. 5236 05:01:30,480 --> 05:01:33,541 But let's consider the\ndesign of this program. 5237 05:01:33,541 --> 05:01:40,941 What is, kind of, bad about it, or if\n 5238 05:01:40,940 --> 05:01:43,740 are we going to regret the\ndesign of this program? 5239 05:01:43,741 --> 05:01:45,251 What might not be ideal here? 5240 05:01:54,625 --> 05:01:58,480 DAVID MALAN: Yeah, so in this case,\n 5241 05:01:58,480 --> 05:02:01,400 So, if I'm hearing you\ncorrectly, this program 5242 05:02:01,401 --> 05:02:03,861 is only ever going to tell\nme this specific average. 
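The difference between dividing by 3 and dividing by 3.0 can be sketched as two small functions. The names are illustrative only, and the scores 72, 73 and 33 are the ones used for scores.c in this lecture.

```c
// Integer division: every operand is an int, so the remainder is thrown away.
int average_truncated(int a, int b, int c)
{
    return (a + b + c) / 3;
}

// Dividing by 3.0 promotes the int sum to a floating point value first,
// so the fractional part of the average survives.
double average_precise(int a, int b, int c)
{
    return (a + b + c) / 3.0;
}
```

For 72, 73 and 33 the sum is 178, so the first function returns 59 while the second returns roughly 59.33. Passing the second result to printf with %f also matches the format code, which is what the compiler error was complaining about.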
5243 05:02:03,861 --> 05:02:05,991 I'm not even using\nsomething like, get int 5244 05:02:05,991 --> 05:02:09,051 or get float to get three different\nscores, so that's not good. 5245 05:02:09,050 --> 05:02:11,202 And suppose that we wait\nlater in the semester 5246 05:02:11,203 --> 05:02:12,661 I think other problems could arise. 5247 05:02:13,161 --> 05:02:15,280 AUDIENCE: Just thinking\nalso somewhat of an issue 5248 05:02:15,280 --> 05:02:17,161 that you can't reuse that number. 5249 05:02:17,161 --> 05:02:19,710 DAVID MALAN: I can't\nreuse the number because I 5250 05:02:19,710 --> 05:02:23,348 haven't stored the average in some\n 5251 05:02:23,348 --> 05:02:25,890 a big deal, but certainly, if\nI wanted to reuse it elsewhere 5252 05:02:26,911 --> 05:02:29,286 Let's fast-forward again, a\nlittle later in the semester 5253 05:02:29,286 --> 05:02:31,651 I don't just have three\ntest scores or exam scores 5254 05:02:34,951 --> 05:02:36,562 AUDIENCE: Yeah, if you\never want to have to take 5255 05:02:36,562 --> 05:02:39,161 the average of any number of\nscores other than 3, [INAUDIBLE] 5256 05:02:39,161 --> 05:02:42,370 DAVID MALAN: Yeah, I've sort\nof, capped this program at 3. 5257 05:02:42,370 --> 05:02:45,202 And honestly, this is, kind\nof, bordering on copy paste. 5258 05:02:45,203 --> 05:02:48,161 Even though the variables, yes, have\n 5259 05:02:49,061 --> 05:02:51,491 Imagine doing this for a\nwhole grade book for a class. 5260 05:02:51,491 --> 05:02:57,251 Having to score 4, 5, 6, 11 10, 12,\n 5261 05:02:57,251 --> 05:02:59,681 You can imagine just\nhow ugly the code starts 5262 05:02:59,681 --> 05:03:02,896 to get if you're just defining variable\n 5263 05:03:02,896 --> 05:03:07,001 So it turns out, there are\nbetter ways, in languages like C 5264 05:03:07,001 --> 05:03:11,501 if you want to have multiple\nvalues stored in memory that 5265 05:03:11,501 --> 05:03:13,300 happened to be of the same data type. 
5266 05:03:13,300 --> 05:03:14,681 Let's take a look back\nat this memory, here 5267 05:03:14,681 --> 05:03:16,806 to see what these things\nmight look like in memory. 5268 05:03:16,806 --> 05:03:18,431 Here's that grid of memory. 5269 05:03:18,431 --> 05:03:20,710 Each of these recall represents a byte. 5270 05:03:20,710 --> 05:03:23,951 To be clear, if I store\nscore 1 in memory first 5271 05:03:23,951 --> 05:03:25,390 how many bytes will it take up? 5272 05:03:28,690 --> 05:03:32,838 So I might draw a score 1 as\nfilling up this part of the memory. 5273 05:03:32,838 --> 05:03:36,131 It's up to the computer as to whether it\n 5274 05:03:36,131 --> 05:03:39,550 I'm just keeping the pictures clean\n 5275 05:03:39,550 --> 05:03:42,341 If I, then, declare another\nvariable, called score 2 5276 05:03:42,341 --> 05:03:44,991 it might end up over there,\nalso taking up 4 bytes. 5277 05:03:44,991 --> 05:03:47,591 And then score 3 might end up here. 5278 05:03:47,591 --> 05:03:51,140 So that's just representing what's going\n 5279 05:03:51,140 --> 05:03:54,940 But technically speaking, to\nbe clear, per week 0, what's 5280 05:03:54,940 --> 05:03:58,841 really being stored in the computer's\n 5281 05:03:58,841 --> 05:04:03,611 32 total, in this case,\nbecause 32 bits is 4 bytes. 5282 05:04:03,611 --> 05:04:07,541 But again, it gets boring\nquickly to think in and look 5283 05:04:09,021 --> 05:04:11,381 So we'll, generally, abstract\nthis away as just using 5284 05:04:11,381 --> 05:04:13,811 decimal numbers, in this case, instead. 5285 05:04:13,811 --> 05:04:18,431 But there might be a better way to\n 5286 05:04:18,431 --> 05:04:21,761 but maybe four, maybe,\nfive, maybe 10, maybe, more 5287 05:04:21,760 --> 05:04:27,370 by declaring one variable to store\n 5288 05:04:27,370 --> 05:04:30,010 or more individual variables. 5289 05:04:30,010 --> 05:04:34,510 The way to do this is by way\nof something known as an array. 
5290 05:04:34,510 --> 05:04:42,580 An array is another type of data that\n 5291 05:04:42,580 --> 05:04:45,240 of the same type back-to-back-to-back. 5292 05:04:45,241 --> 05:04:46,491 That is, to say, contiguously. 5293 05:04:46,491 --> 05:04:54,101 So an array can let you create\n 5294 05:04:54,100 --> 05:04:56,860 or even more than\nthat, but describe them 5295 05:04:56,861 --> 05:05:00,651 all using the same variable\nname, the same one name. 5296 05:05:00,651 --> 05:05:05,001 So for instance, if, for one\n 5297 05:05:05,001 --> 05:05:10,061 but I don't want to messily declare\n 5298 05:05:11,221 --> 05:05:13,390 This is today's first\nnew piece of syntax 5299 05:05:13,390 --> 05:05:15,550 the square brackets\nthat we're now seeing. 5300 05:05:15,550 --> 05:05:21,400 This line of code, here, is\nsimilar to int score 1 semicolon 5301 05:05:21,401 --> 05:05:24,621 or int score 1 equals 72 semicolon. 5302 05:05:24,620 --> 05:05:30,040 This line of code is declaring for\n 5303 05:05:30,041 --> 05:05:33,521 And that array is going\nto store three integers. 5304 05:05:34,030 --> 05:05:39,251 Because the type of that\narray is an int, here. 5305 05:05:39,251 --> 05:05:42,370 The square brackets tell the\ncomputer how many ints you want. 5306 05:05:43,241 --> 05:05:45,401 And the name is, of course, scores. 5307 05:05:45,401 --> 05:05:47,801 Which, in English, I've\ndeliberately pluralized 5308 05:05:47,800 --> 05:05:52,361 so that I can describe this array\n 5309 05:05:52,361 --> 05:05:57,230 So if I want to now assign values\n 5310 05:05:59,021 --> 05:06:04,421 I can say, scores bracket 0 equals\n 5311 05:06:04,420 --> 05:06:06,450 and scores bracket 2 equals 33. 5312 05:06:06,451 --> 05:06:08,201 The only thing weird\nthere is, admittedly 5313 05:06:08,201 --> 05:06:10,091 the square brackets which are still new. 5314 05:06:10,091 --> 05:06:14,081 But we're also, notice,\n0 indexing things. 
5315 05:06:14,080 --> 05:06:16,605 To zero index means to\nstart counting at 0. 5316 05:06:16,605 --> 05:06:18,730 When we've talked about\nthat before, our for loops 5317 05:06:18,730 --> 05:06:20,260 have, generally, been zero indexed. 5318 05:06:20,260 --> 05:06:24,130 Arrays in C are zero indexed. 5319 05:06:24,131 --> 05:06:25,691 And you do not have a choice over that. 5320 05:06:25,690 --> 05:06:28,810 You can't start counting at 1\nin arrays because you prefer to 5321 05:06:28,811 --> 05:06:31,091 you'd be sacrificing\none of the elements. 5322 05:06:31,091 --> 05:06:33,881 You have to start in\narrays counting from 0. 5323 05:06:33,881 --> 05:06:37,390 So out of context, this\ndoesn't solve a problem 5324 05:06:37,390 --> 05:06:39,460 but it, definitely, is\ngoing to once we have more 5325 05:06:39,460 --> 05:06:41,170 than, even, three scores here. 5326 05:06:41,170 --> 05:06:44,010 In fact, let me change\nthis program a little bit. 5327 05:06:45,710 --> 05:06:48,280 And delete these three lines, here. 5328 05:06:48,280 --> 05:06:51,341 And replace it with a\nscores variable that's 5329 05:06:51,341 --> 05:06:54,401 ready to store three total integers. 5330 05:06:54,401 --> 05:06:58,391 And then, initialize them as\nfollows, scores bracket 0 is 72 5331 05:06:58,390 --> 05:07:02,560 as before, scores bracket 1 is\ngoing to be 73, scores bracket 2 5332 05:07:04,001 --> 05:07:08,329 Notice, I do not need to say\nint before any of these lines 5333 05:07:08,329 --> 05:07:10,121 because that's been\ntaken care of, already 5334 05:07:10,120 --> 05:07:14,830 for me on line 5, where I already\n 5335 05:07:17,591 --> 05:07:21,280 Now, down here, this code needs\n 5336 05:07:21,280 --> 05:07:23,560 three variables, score 1, 2, and 3. 5337 05:07:23,561 --> 05:07:28,211 I have 1 variable, but\nthat I can index into.
5338 05:07:28,210 --> 05:07:33,010 I'm going to, here, then, do scores\n 5339 05:07:33,010 --> 05:07:37,630 plus scores bracket 2, which is\n 5340 05:07:37,631 --> 05:07:39,161 giving me back those three integers. 5341 05:07:39,161 --> 05:07:42,120 But notice, I'm using the same\nvariable name, every time. 5342 05:07:42,120 --> 05:07:45,330 And again, I'm using this new square\n 5343 05:07:45,330 --> 05:07:50,850 index into the array to get at the first\n 5344 05:07:50,850 --> 05:07:53,100 and then, to do it again down here. 5345 05:07:53,100 --> 05:07:56,167 Now, this program, still not really\n 5346 05:07:56,168 --> 05:07:58,501 I still can only store three\nscores, but we'll come back 5347 05:07:58,501 --> 05:08:00,190 to something like that before long. 5348 05:08:00,190 --> 05:08:03,210 But for now, we're just introducing\n 5349 05:08:03,210 --> 05:08:09,240 whereby, I can now store multiple\nvalues in the same variable. 5350 05:08:09,241 --> 05:08:11,371 Well, let's enhance this a bit more. 5351 05:08:11,370 --> 05:08:14,920 Instead of hard coding these scores,\n 5352 05:08:14,920 --> 05:08:19,050 let's use get int to ask\nthe user for a score. 5353 05:08:19,050 --> 05:08:22,591 Let's, then, use get int to\nask the user for another score. 5354 05:08:22,591 --> 05:08:25,800 Let's use get int to ask\nthe user for a third score 5355 05:08:25,800 --> 05:08:28,661 storing them in those\nrespective locations. 5356 05:08:28,661 --> 05:08:34,080 And, now, if I go ahead and save\n 5357 05:08:35,161 --> 05:08:38,251 Now these errors should be\ngetting a little familiar. 5358 05:08:41,010 --> 05:08:42,135 Let me give folks a moment. 5359 05:08:45,361 --> 05:08:48,480 That was not intentional, so still\n 5360 05:08:50,580 --> 05:08:53,830 Now, I'm going to go back to the bottom\n 5361 05:08:54,330 --> 05:08:55,930 We're back in business, ./scores. 5362 05:08:55,931 --> 05:08:58,181 Now, the program is getting\na little more interesting. 
5363 05:08:58,181 --> 05:09:02,280 So maybe, this year was better and I got\n 5364 05:09:05,161 --> 05:09:06,631 So now, it's a little more dynamic. 5365 05:09:06,631 --> 05:09:07,531 It's a little more interesting. 5366 05:09:07,530 --> 05:09:10,238 But it's still capping the number\n 5367 05:09:10,239 --> 05:09:15,001 But now, I've introduced another,\n 5368 05:09:15,001 --> 05:09:18,368 There's this expression in programming,\n 5369 05:09:18,368 --> 05:09:20,161 [SNIFFS AIR] something\nsmells a little off. 5370 05:09:20,161 --> 05:09:24,811 And there's something off here in\n 5371 05:09:24,811 --> 05:09:29,341 Does anyone see an opportunity to\n 5372 05:09:29,341 --> 05:09:32,491 if my goal, still, is to get three\n 5373 05:09:32,491 --> 05:09:34,691 without it smelling [SNIFF] kind of bad? 5374 05:09:35,190 --> 05:09:37,200 AUDIENCE: [INAUDIBLE] use a for loop? 5375 05:09:37,201 --> 05:09:40,219 That way you don't have to copy\nand paste all of those scores. 5376 05:09:40,219 --> 05:09:41,421 DAVID MALAN: Yeah, exactly. 5377 05:09:41,420 --> 05:09:43,282 Those lines of code\nare almost identical. 5378 05:09:43,282 --> 05:09:45,740 And honestly, the only thing\nthat's changing is the number 5379 05:09:45,741 --> 05:09:47,361 and it's just incrementing by 1. 5380 05:09:47,361 --> 05:09:49,591 We have all of the building\nblocks to do this better. 5381 05:09:49,591 --> 05:09:51,390 So let me go ahead and improve this. 5382 05:09:55,980 --> 05:10:00,411 So for int i gets 0, i\nless than 3, i plus plus. 5383 05:10:00,411 --> 05:10:03,320 Then, inside of this for loop,\nI can distill all three 5384 05:10:03,320 --> 05:10:05,120 of those lines into\nsomething more generic 5385 05:10:05,120 --> 05:10:10,790 like scores bracket i equals get\n 5386 05:10:10,791 --> 05:10:13,166 once, via get int, for a score. 5387 05:10:13,166 --> 05:10:16,261 So this is where arrays\nstart to get pretty powerful.
5388 05:10:16,260 --> 05:10:18,260 You don't have to hard\ncode, that is, literally 5389 05:10:18,260 --> 05:10:20,722 type in all of these magic\nnumbers like 0, 1, and 2. 5390 05:10:20,723 --> 05:10:22,431 You can start to do\nit, programmatically 5391 05:10:24,030 --> 05:10:25,611 So now, I've tightened things up. 5392 05:10:25,611 --> 05:10:28,491 I'm now, dynamically, getting\nthree different scores 5393 05:10:28,491 --> 05:10:31,027 but putting them in three\ndifferent locations. 5394 05:10:31,026 --> 05:10:34,730 And so this program, ultimately, is\n 5395 05:10:34,730 --> 05:10:41,780 Make scores, ./scores, and 100, 99,\n 5396 05:10:41,780 --> 05:10:43,700 But it's a little better designed, too. 5397 05:10:43,701 --> 05:10:45,620 If I really want to\nnitpick, there's something 5398 05:10:45,620 --> 05:10:47,361 that still smells, a little bit, here. 5399 05:10:47,361 --> 05:10:51,800 The fact that I have indeed, this\n 5400 05:10:51,800 --> 05:10:54,150 has to be the same as this number here. 5401 05:10:54,151 --> 05:10:56,431 Otherwise, who knows\nwhat's going to go wrong. 5402 05:10:56,431 --> 05:10:58,640 So what might be a\nsolution, per last week 5403 05:10:58,640 --> 05:11:01,221 to cleaning that code up further, too? 5404 05:11:01,221 --> 05:11:04,011 AUDIENCE: [INAUDIBLE]\nthe user's discretion 5405 05:11:04,010 --> 05:11:06,002 how many input scores [INAUDIBLE]. 5406 05:11:06,003 --> 05:11:09,050 DAVID MALAN: OK, so we could leave\n 5407 05:11:09,050 --> 05:11:11,760 And so we could, actually,\ndo something like this. 5408 05:11:11,760 --> 05:11:13,460 Let me take this a few steps ahead. 5409 05:11:13,460 --> 05:11:20,490 Let me say something like, int n gets\n 5410 05:11:20,491 --> 05:11:24,861 then I could actually change this\n 5411 05:11:24,861 --> 05:11:27,230 and, indeed, make the\nwhole program dynamic? 5412 05:11:27,230 --> 05:11:29,931 Ask the human how many tests\nhave there been this semester? 
5413 05:11:29,931 --> 05:11:31,761 Then, you can type in\neach of those scores 5414 05:11:31,760 --> 05:11:33,968 because the loop is going\nto iterate that many times. 5415 05:11:33,969 --> 05:11:37,281 And then you'll get the average\nof one test, two test, three-- 5416 05:11:37,280 --> 05:11:41,780 well, lost another-- or however\nmany scores that were actually 5417 05:11:41,780 --> 05:11:45,021 specified by the user Yeah, question? 5418 05:11:45,021 --> 05:11:50,026 AUDIENCE: How many bits or\nbytes get used in an array? 5419 05:11:50,026 --> 05:11:52,320 DAVID MALAN: How many\nbytes are used in an array? 5420 05:11:52,320 --> 05:11:56,784 AUDIENCE: [INAUDIBLE] point of\ndoing this is to save [INAUDIBLE] 5421 05:11:56,784 --> 05:11:59,760 DAVID MALAN: So the purpose of\nan array is not to save space. 5422 05:11:59,760 --> 05:12:03,270 It's to eliminate having\nmultiple variable names 5423 05:12:03,271 --> 05:12:05,161 because that gets very messy quickly. 5424 05:12:05,161 --> 05:12:09,241 If you have score 1, score 2,\nscore 3, dot, dot, dot, score 99 5425 05:12:09,241 --> 05:12:12,361 that's, like, 99 different\nvariables, potentially 5426 05:12:12,361 --> 05:12:18,421 that you could collapse into one\nvariable that has 99 locations. 5427 05:12:18,420 --> 05:12:20,490 At different indices, or indexes. 5428 05:12:20,491 --> 05:12:22,831 As someone would say,\nthe index for an array 5429 05:12:22,830 --> 05:12:25,016 is whatever is in the square brackets. 5430 05:12:35,820 --> 05:12:37,541 DAVID MALAN: So it's a good question. 5431 05:12:37,541 --> 05:12:39,631 So if you-- I'm using\nints for everything-- 5432 05:12:39,631 --> 05:12:41,820 and honestly, we don't\nreally need ints for scores 5433 05:12:41,820 --> 05:12:46,030 because I'm not likely to get a\n 5434 05:12:46,030 --> 05:12:47,880 And so you could use\ndifferent data types. 5435 05:12:47,881 --> 05:12:50,548 And that list we had on the screen,\nearlier, is not all of them. 
5436 05:12:50,547 --> 05:12:54,030 There's a data type called short,\nwhich is shorter than an int 5437 05:12:54,030 --> 05:12:59,111 you could, technically, use char, in\n 5438 05:12:59,111 --> 05:13:01,201 Generally speaking, in\nthe year 2021, these 5439 05:13:01,201 --> 05:13:05,251 tend to be over optima--\noverly optimized decisions. 5440 05:13:05,251 --> 05:13:07,201 Everyone just uses\nints, even though no one 5441 05:13:07,201 --> 05:13:10,561 is going to get a test score that's 2\n 5442 05:13:11,521 --> 05:13:14,512 Years ago, memory was expensive. 5443 05:13:14,512 --> 05:13:16,470 And every one of your\ninstincts would have been 5444 05:13:16,471 --> 05:13:18,961 spot on because memory is so tight. 5445 05:13:18,960 --> 05:13:21,190 But, nowadays, we don't\nworry as much about it. 5446 05:13:21,690 --> 05:13:26,816 AUDIENCE: I have a question\nabout the error [INAUDIBLE].. 5447 05:13:26,816 --> 05:13:30,865 Could it-- when you're doing a\nhash problem on the problem set-- 5448 05:13:30,866 --> 05:13:34,271 DAVID MALAN: So what is the\ndifference between dividing two ints 5449 05:13:34,271 --> 05:13:36,640 and not getting an error, as\nyou might have encountered 5450 05:13:36,640 --> 05:13:40,181 in a program like cash,\nversus dividing two ints 5451 05:13:40,181 --> 05:13:42,411 and getting an error\nlike I did a moment ago? 5452 05:13:42,411 --> 05:13:46,541 The problem with the scenario I created\n 5453 05:13:46,541 --> 05:13:52,241 And I was telling printf to use a %f,\n 5454 05:13:52,241 --> 05:13:54,841 of dividing integers by another integer. 5455 05:13:54,841 --> 05:13:57,190 So it was printf that was yelling at me. 5456 05:13:57,190 --> 05:14:00,190 I'm guessing in the scenario you're\n 5457 05:14:00,190 --> 05:14:03,440 printf was not involved in\nthat particular line of code. 5458 05:14:03,440 --> 05:14:05,126 So that's the difference, there. 5459 05:14:05,920 --> 05:14:09,370 So we, now, have this\nability to create an array. 
5460 05:14:09,370 --> 05:14:11,771 And an array can store multiple values. 5461 05:14:11,771 --> 05:14:15,710 What, then, might we do that's more\n 5462 05:14:16,210 --> 05:14:18,490 Well, let's take this one step further. 5463 05:14:18,491 --> 05:14:25,391 As opposed to just storing 72, 73, 33 or\n 5464 05:14:25,390 --> 05:14:30,190 because again, an array gives you one\n 5465 05:14:30,190 --> 05:14:32,620 or indices therein,\nbracket 0, bracket 1 5466 05:14:32,620 --> 05:14:35,591 bracket 2 on up, if it\nwere even bigger than that. 5467 05:14:35,591 --> 05:14:40,361 Let's, now, start to consider something\n 5468 05:14:40,361 --> 05:14:43,091 Chars, being 1 byte each,\nso they're even smaller 5469 05:14:43,091 --> 05:14:44,350 they take up much less space. 5470 05:14:44,350 --> 05:14:46,308 And, indeed, if I wanted\nto say a message like 5471 05:14:46,309 --> 05:14:48,460 hi I could use three variables. 5472 05:14:48,460 --> 05:14:52,780 If I wanted a program to print,\nhi, H-I exclamation point 5473 05:14:52,780 --> 05:14:57,490 I could, of course, store those in\n 5474 05:14:57,491 --> 05:15:00,971 And let's, for the sake of discussion,\n 5475 05:15:00,971 --> 05:15:03,941 Let me create a new\nprogram, now, in VS Code. 5476 05:15:03,940 --> 05:15:07,181 This time, I'm going to call it hi.c. 5477 05:15:07,181 --> 05:15:09,911 And I'm not going to bother\nwith the CS50 library. 5478 05:15:09,911 --> 05:15:11,920 I just need the standard\nI/O one, for now. 5479 05:15:13,480 --> 05:15:16,661 And then, inside of main, I'm going\n 5480 05:15:16,661 --> 05:15:20,021 And this is already, hopefully,\nstriking you as a bad idea. 5481 05:15:20,021 --> 05:15:22,570 But we'll go down this\nroad, temporarily 5482 05:15:22,570 --> 05:15:26,561 with c1, and c2, and, finally, c3. 5483 05:15:26,561 --> 05:15:29,921 Storing each character in\nthe phrase I want to print 5484 05:15:29,920 --> 05:15:33,710 and I'm going to print this\nin a different way than usual. 
5485 05:15:35,140 --> 05:15:38,740 And we've, generally, dealt with\n 5486 05:15:38,741 --> 05:15:45,861 But %c, %c, %c, will let me print out\n 5487 05:15:45,861 --> 05:15:48,681 So, kind of, a stupid way\nof printing out a string. 5488 05:15:48,681 --> 05:15:51,201 So we already have a solution\nto this problem last week. 5489 05:15:51,201 --> 05:15:54,800 But let's poke around at what's\n 5490 05:15:58,736 --> 05:16:00,611 But we, again, could\nhave done this last week 5491 05:16:00,611 --> 05:16:03,791 with a string and just one\nvariable, or even, 0, at that. 5492 05:16:03,791 --> 05:16:07,480 But let's start converting\nthese characters 5493 05:16:07,480 --> 05:16:12,010 to their apparent numeric equivalents\n 5494 05:16:12,010 --> 05:16:16,570 Let me modify these %c's,\njust to be fun, to be %i's. 5495 05:16:16,570 --> 05:16:20,440 And let me add some spaces so there\n 5496 05:16:20,440 --> 05:16:24,610 Let me, now, recompile\nhi, and let me rerun it. 5497 05:16:24,611 --> 05:16:27,161 Just to guess, what should\nI see on the screen now? 5498 05:16:30,960 --> 05:16:32,296 AUDIENCE: The ASCII values? 5499 05:16:32,296 --> 05:16:34,020 DAVID MALAN: The ASCII values. 5500 05:16:34,021 --> 05:16:36,480 And it's intentional that\nI keep using the same word 5501 05:16:36,480 --> 05:16:42,510 hi, because it should be, hopefully,\n 5502 05:16:42,510 --> 05:16:46,380 Which, is to say, that c knows about\n 5503 05:16:46,381 --> 05:16:48,581 and can do this conversion\nfor us automatically. 5504 05:16:48,580 --> 05:16:51,930 And it seems to be doing it\nimplicitly for us, so to speak. 5505 05:16:51,931 --> 05:16:55,261 Notice that c1, c2 and\nc3 are, obviously, chars 5506 05:16:55,260 --> 05:16:58,680 but printf is able to tolerate\nprinting them as integers. 
5507 05:16:58,681 --> 05:17:03,131 If I really want it to be pedantic,\n 5508 05:17:03,131 --> 05:17:05,581 known as typecasting,\nwhere I can actually 5509 05:17:05,580 --> 05:17:10,870 convert one data type to another,\n 5510 05:17:10,870 --> 05:17:14,161 And we saw in week 0,\nchars, or characters 5511 05:17:14,161 --> 05:17:17,760 are just numbers, like 72, 73, and 33. 5512 05:17:17,760 --> 05:17:21,940 So I can use this parenthetical\n 5513 05:17:21,940 --> 05:17:26,883 [LAUGHTER] three chars to\nthree integers, instead. 5514 05:17:26,883 --> 05:17:28,800 So that's what I meant\nto type the first time. 5515 05:17:30,061 --> 05:17:33,541 So parenthesis, int,\nclose parenthesis says 5516 05:17:33,541 --> 05:17:39,101 take whatever variable comes after this,\n 5517 05:17:39,100 --> 05:17:42,900 The effect is going to be no different,\n 5518 05:17:42,901 --> 05:17:49,171 then running ./hi still works the same,\n 5519 05:17:49,920 --> 05:17:53,520 And we can do this all day long,\nchars to ints, floats to ints 5520 05:17:54,510 --> 05:17:56,148 Sometimes, it's equivalent. 5521 05:17:56,149 --> 05:17:58,066 Other times, you're going\nto lose information. 5522 05:17:58,065 --> 05:18:01,530 Taking a float to an\nint, just intuitively 5523 05:18:01,530 --> 05:18:04,050 is going to throw away everything\nafter the decimal point 5524 05:18:04,050 --> 05:18:06,940 because an int has no decimal point. 5525 05:18:06,940 --> 05:18:09,360 But, for now, I'm going to\nrewind to the version of this 5526 05:18:09,361 --> 05:18:13,411 that just did implicit-type\nconversion, or implicit casting 5527 05:18:13,411 --> 05:18:17,611 just to demonstrate that we can, indeed,\n 5528 05:18:18,210 --> 05:18:20,630 Let me go ahead and do\nthis, now, the week 1 way. 
5529 05:18:21,631 --> 05:18:24,466 Let's just do printf, quote-unquote-- 5530 05:18:24,466 --> 05:18:28,890 Actually, let's do this, string\ns equals quote-unquote hi 5531 05:18:28,890 --> 05:18:33,940 and then let's do a simple printf\n 5532 05:18:33,940 --> 05:18:36,780 So now I've rewound to last\nweek, where we began this story 5533 05:18:36,780 --> 05:18:40,920 but you'll notice that, if we\nkeep playing around with this-- 5534 05:18:43,120 --> 05:18:47,730 Oh, and let me introduce the CS50 library\n 5535 05:18:47,730 --> 05:18:50,521 Let me go ahead and\nrecompile, rerun this 5536 05:18:50,521 --> 05:18:52,529 we seem to be coding in circles, here. 5537 05:18:52,528 --> 05:18:55,070 Like, I've just done the same\nthing multiple, different ways. 5538 05:18:55,070 --> 05:18:57,661 But there's clearly\nan equivalence, then 5539 05:18:57,661 --> 05:19:01,239 between sequences of chars and strings. 5540 05:19:01,239 --> 05:19:03,031 And if you do it the\nreal pedantic way, you 5541 05:19:03,030 --> 05:19:07,650 have three different variables, c1, c2,\n 5542 05:19:07,651 --> 05:19:12,131 or you can just treat them all together\n 5543 05:19:12,131 --> 05:19:16,291 But it turns out that\nstrings are actually 5544 05:19:16,291 --> 05:19:22,320 implemented by the computer\nin a pretty now familiar way. 5545 05:19:22,320 --> 05:19:28,643 What might a string actually be\nas of this point in the story? 5546 05:19:28,643 --> 05:19:29,850 Where are we going with this? 5547 05:19:29,850 --> 05:19:31,183 Let me try to look further back. 5548 05:19:32,611 --> 05:19:34,861 AUDIENCE: Can a string like\nthis be an array of chars? 5549 05:19:34,861 --> 05:19:37,671 DAVID MALAN: Yeah, a string\nmight be, and indeed is, just 5550 05:19:39,061 --> 05:19:41,451 So last week we took for\ngranted that strings exist.
5551 05:19:41,451 --> 05:19:43,791 Technically, strings exist,\nbut they're implemented 5552 05:19:43,791 --> 05:19:47,331 as arrays of characters,\nwhich actually opens up 5553 05:19:47,330 --> 05:19:50,030 some interesting possibilities for us. 5554 05:19:50,030 --> 05:19:52,560 Because, let me see, let\nme see if I can do this. 5555 05:19:52,561 --> 05:19:55,820 Let me try to print out,\nnow, three integers again. 5556 05:19:55,820 --> 05:20:01,791 But if string s is but an array, as you\n 5557 05:20:01,791 --> 05:20:04,021 s bracket 1, and s bracket 2. 5558 05:20:04,021 --> 05:20:07,911 So maybe I can start poking\naround inside of strings 5559 05:20:07,911 --> 05:20:09,890 even though we didn't\ndo this last week, so I 5560 05:20:09,890 --> 05:20:11,521 can get at those individual values. 5561 05:20:11,521 --> 05:20:15,530 So make hi, ./hi and,\nvoila, there we go again. 5562 05:20:15,530 --> 05:20:20,468 It's the same 72, 73, 33, but\nnow, I'm sort of, hopefully 5563 05:20:20,469 --> 05:20:22,761 like, wrapping my mind around\nthe fact that, all right 5564 05:20:22,760 --> 05:20:25,911 a string is just an array of\ncharacters, and arrays, you 5565 05:20:25,911 --> 05:20:29,221 can index into them using this\nnew square bracket notation. 5566 05:20:29,221 --> 05:20:32,301 So I can get at any one of\nthese individual characters 5567 05:20:32,300 --> 05:20:38,315 and, heck, convert it to an\ninteger like we did in week 0. 5568 05:20:38,315 --> 05:20:41,271 Let me get a little curious now. 5569 05:20:41,271 --> 05:20:44,280 What else might be in\nthe computer's memory? 5570 05:20:44,280 --> 05:20:47,810 Well, let's-- I'll go back to the\n 5571 05:20:47,811 --> 05:20:50,121 Here might be how we\noriginally implemented hi 5572 05:20:50,120 --> 05:20:53,060 with three variables, c1, c2, c3. 5573 05:20:53,061 --> 05:20:55,761 Of course, that map to these\ndecimal digits or equivalent 5574 05:20:57,140 --> 05:20:59,570 But what was this\nlooking like in memory? 
5575 05:20:59,570 --> 05:21:02,510 Literally, when you create a\nstring in memory, like this 5576 05:21:02,510 --> 05:21:05,501 string s equals quote-unquote hi,\nlet's consider what's going on 5577 05:21:05,501 --> 05:21:06,876 underneath the hood, so to speak. 5578 05:21:06,876 --> 05:21:11,751 Well, as an abstraction, a string,\n 5579 05:21:11,751 --> 05:21:13,177 it would seem, 3 bytes, right? 5580 05:21:13,177 --> 05:21:15,260 I've gotten rid of the\nbars, there, because if you 5581 05:21:15,260 --> 05:21:19,911 think of a string as a type, I'm just\n 5582 05:21:19,911 --> 05:21:24,471 But technically, a string, we've\njust revealed, is an array 5583 05:21:26,091 --> 05:21:28,010 So technically, if the\nstring is called s 5584 05:21:28,010 --> 05:21:30,230 s bracket 0 will give\nyou the first character 5585 05:21:30,230 --> 05:21:34,070 s bracket 1, the second,\nand s bracket 2, the third. 5586 05:21:34,070 --> 05:21:37,550 But let me ask this question now,\n 5587 05:21:37,550 --> 05:21:40,820 is the only thing in\nyour computer memory 5588 05:21:40,820 --> 05:21:45,050 and the ability, like a canvas to draw\n 5589 05:21:45,050 --> 05:21:46,881 or whatever on it, but\nthat's it, like this 5590 05:21:46,881 --> 05:21:50,031 is what your Mac, and PC, and\nphone ultimately reduce to. 5591 05:21:50,030 --> 05:21:53,990 Suppose that I'm running a piece\n 5592 05:21:53,991 --> 05:21:57,261 and now I write down\nbye exclamation point. 5593 05:21:57,260 --> 05:21:59,120 Well, where might that go in memory? 5594 05:22:00,105 --> 05:22:03,594 B-Y-E. And then the next thing I type\n 5595 05:22:03,594 --> 05:22:05,511 My memory just might get\nfilled up, over time 5596 05:22:05,510 --> 05:22:08,570 with things that you or\nsomeone else are typing. 5597 05:22:08,570 --> 05:22:14,841 But then how does the computer know if,\n 5598 05:22:14,841 --> 05:22:20,411 is right after H-I exclamation point\n 5599 05:22:23,690 --> 05:22:27,330 All we have are bytes, or 0s and 1s.
5600 05:22:27,330 --> 05:22:29,990 So if you were designing\nthis, how would you 5601 05:22:29,991 --> 05:22:32,541 implement some kind of\ndelimiter between the two? 5602 05:22:32,541 --> 05:22:34,521 Or figure out what the\nlength of a string is? 5603 05:22:36,408 --> 05:22:39,367 DAVID MALAN: OK, so the right\nanswer is use a nul character 5604 05:22:39,367 --> 05:22:41,450 and for those who don't\nknow, what does that mean? 5605 05:22:43,753 --> 05:22:45,710 DAVID MALAN: Yeah, so\nit's a special character. 5606 05:22:45,710 --> 05:22:47,780 Let me describe it as\na sentinel character. 5607 05:22:47,780 --> 05:22:49,835 Humans decided some\ntime ago that you know 5608 05:22:49,835 --> 05:22:52,820 what, if we want to delineate\nwhere one string ends 5609 05:22:52,820 --> 05:22:56,271 and where the next one begins,\nwe just need some special symbol. 5610 05:22:56,271 --> 05:22:59,450 And the symbol they'll use is\ngenerally written as backslash 0. 5611 05:22:59,450 --> 05:23:03,816 This is just shorthand notation\nfor literally eight 0 bits. 5612 05:23:06,800 --> 05:23:10,400 And the nickname for eight\n0 bits, in this context 5613 05:23:13,190 --> 05:23:16,170 And we can actually see this as follows. 5614 05:23:16,170 --> 05:23:18,173 If you look at the\ncorresponding decimal digits 5615 05:23:18,173 --> 05:23:20,841 like you could do by doing out\nthe math or doing the conversion 5616 05:23:20,841 --> 05:23:25,820 like we've done in code, you would\n 5617 05:23:25,820 --> 05:23:30,861 but then 1 extra byte that's sort of\n 5618 05:23:30,861 --> 05:23:33,381 And now I've just written\nit as the decimal number 0. 5619 05:23:33,381 --> 05:23:36,381 The implication of this is\nthat the computer is apparently 5620 05:23:36,381 --> 05:23:40,956 using, not 3 bytes to store\na word like hi, but 4 bytes. 5621 05:23:40,955 --> 05:23:46,310 Whatever the length of the string is,\n 5622 05:23:46,311 --> 05:23:48,901 that demarcates the end of the string. 
5623 05:23:48,901 --> 05:23:50,941 So we might draw it like this instead. 5624 05:23:50,940 --> 05:23:55,610 And this character is, again,\npronounced nul, or written N-U-L. 5625 05:23:56,580 --> 05:23:59,330 If humans, at the end of the day,\n 5626 05:23:59,330 --> 05:24:01,163 they just needed to\ndecide, all right, well 5627 05:24:01,163 --> 05:24:04,251 how do we distinguish\none string from another? 5628 05:24:04,251 --> 05:24:06,920 It's a lot easier with\nchars, individually, it's 5629 05:24:06,920 --> 05:24:09,710 a lot easier with ints, it's\neven easier With floats, why? 5630 05:24:09,710 --> 05:24:13,880 Because, per that chart earlier,\n 5631 05:24:13,881 --> 05:24:16,070 Every int is always 4 bytes. 5632 05:24:16,070 --> 05:24:19,010 Every long is always 8 bytes. 5633 05:24:20,540 --> 05:24:24,021 Well, hi is 1, 2, 3 with\nan exclamation point. 5634 05:24:24,021 --> 05:24:27,290 Bye is 1, 2, 3, 4 with\nan exclamation point. 5635 05:24:27,290 --> 05:24:30,710 David is D-A-V-I-D, five\nwithout an exclamation point. 5636 05:24:30,710 --> 05:24:34,470 And so a string can be\nany number of bytes long 5637 05:24:34,471 --> 05:24:36,961 so you somehow need to\ndraw a line in the sand 5638 05:24:36,960 --> 05:24:40,967 to separate in memory\none string from another. 5639 05:24:40,967 --> 05:24:43,672 So what's the implication of this? 5640 05:24:43,672 --> 05:24:45,130 Well, let me go back to code, here. 5641 05:24:46,471 --> 05:24:51,390 This is a bit dangerous, but I'm going\n 5642 05:24:53,471 --> 05:24:57,511 So let me go ahead and\nrecompile, make hi. 5643 05:25:02,881 --> 05:25:06,811 Now let me go ahead and\nrerun make hi, ./hi, Enter. 5644 05:25:07,841 --> 05:25:10,921 So you can actually see in the\ncomputer, unbeknownst to you 5645 05:25:10,920 --> 05:25:14,090 previously, that there's indeed\nsomething else going on there. 
5646 05:25:14,091 --> 05:25:17,140 And if I were to make one\nother variant of this program-- 5647 05:25:17,140 --> 05:25:19,890 let's get rid of just this\none word and let's have two. 5648 05:25:19,890 --> 05:25:21,810 So let me give myself\nanother string called t 5649 05:25:21,811 --> 05:25:26,070 for instance, just this common\n 5650 05:25:26,070 --> 05:25:29,161 Let me, then print out with %s. 5651 05:25:29,161 --> 05:25:35,045 And let me also print out with %s,\n 5652 05:25:35,045 --> 05:25:38,580 Let me recompile this program,\nand obviously the out-- 5653 05:25:38,580 --> 05:25:41,730 ugh-- this is what happens\nwhen I go too fast. 5654 05:25:41,730 --> 05:25:45,001 All right, third mistake\ntoday, close quote. 5655 05:25:52,471 --> 05:25:54,871 Now we have a program that's\nprinting both hi and bye 5656 05:25:54,870 --> 05:25:58,980 only so that we can consider what's\n 5657 05:25:58,980 --> 05:26:04,471 If s is storing hi and\napparently one bonus byte that 5658 05:26:04,471 --> 05:26:07,501 demarcates the end of that\nstring, bye is apparently 5659 05:26:07,501 --> 05:26:10,673 going to fit into the\nlocation directly after. 5660 05:26:10,673 --> 05:26:13,591 And it's wrapping around, but that's\n 5661 05:26:13,591 --> 05:26:16,260 But bye, B-Y-E exclamation\npoint is taking up 5662 05:26:16,260 --> 05:26:23,208 1, 2, 3, 4, plus a fifth byte, as well. 5663 05:26:23,208 --> 05:26:27,841 All right, any questions on this\n 5664 05:26:27,841 --> 05:26:29,820 And we'll contextualize\nthis, before long 5665 05:26:29,820 --> 05:26:32,100 so that this isn't just\nlike, OK, who really cares? 5666 05:26:32,100 --> 05:26:34,990 This is going to be the source\nof actually implementing things. 5667 05:26:34,991 --> 05:26:37,771 In fact for problem set 2, like\ncryptography, and encryption 5668 05:26:37,771 --> 05:26:39,728 and scrambling actual human messages. 
5669 05:26:40,771 --> 05:26:44,911 AUDIENCE: So normally if\nyou were to not use string 5670 05:26:44,911 --> 05:26:47,741 you would just make a character\narray that would declare 5671 05:26:47,741 --> 05:26:50,841 how many characters there are so\n 5672 05:26:51,591 --> 05:26:53,741 DAVID MALAN: A good\nquestion, too and let 5673 05:26:53,741 --> 05:26:56,376 me summarize as, if we were\ninstead to use chars all the time 5674 05:26:56,376 --> 05:26:59,501 we would indeed have to know in advance\n 5675 05:26:59,501 --> 05:27:03,010 string that you're storing, how, then,\n 5676 05:27:03,010 --> 05:27:05,260 because when we at CS50 wrote\nthe get string function 5677 05:27:05,260 --> 05:27:07,450 we obviously don't know\nhow long the words are 5678 05:27:07,451 --> 05:27:09,281 going to be that you all are typing in. 5679 05:27:09,280 --> 05:27:12,820 It turns out, two weeks from\nnow we'll see that get string 5680 05:27:12,820 --> 05:27:15,580 uses a technique known as\ndynamic memory allocation. 5681 05:27:15,580 --> 05:27:20,030 And it's going to grow or shrink\n 5682 05:27:22,181 --> 05:27:25,710 AUDIENCE: Why are we using a nul value? 5683 05:27:26,986 --> 05:27:28,111 DAVID MALAN: Good question. 5684 05:27:28,111 --> 05:27:31,140 Why are we using a nul value,\nisn't it wasting a byte? 5685 05:27:31,890 --> 05:27:37,471 But I claim there's really no other way\n 5686 05:27:37,471 --> 05:27:44,009 from the start of another, unless we\n 5687 05:27:44,008 --> 05:27:46,800 All we have, at the end of the day,\n 5688 05:27:46,800 --> 05:27:50,161 Therefore, all we can do is spend\nthose bits in some creative way 5689 05:27:51,780 --> 05:27:54,970 So we're minimally going to spend\n1 byte to solve this problem. 5690 05:27:55,471 --> 05:28:00,158 AUDIENCE: How does our memory device\n 5691 05:28:00,157 --> 05:28:03,530 the backslash n if we don't have\nit stored as a char?
5692 05:28:05,170 --> 05:28:08,950 how does the computer know to move\n 5693 05:28:08,951 --> 05:28:12,251 So backslash n, even though it\nlooks like two characters 5694 05:28:12,251 --> 05:28:16,151 it's actually stored as just 1\nbyte in the computer's memory. 5695 05:28:16,151 --> 05:28:18,618 There's a mapping between\nit and an actual number. 5696 05:28:18,617 --> 05:28:21,700 And you can see that, for instance,\n 5697 05:28:21,701 --> 05:28:25,485 AUDIENCE: So with that being\nstored would be the [INAUDIBLE]. 5698 05:28:26,681 --> 05:28:32,471 If I had put a backslash n in my code here,\n 5699 05:28:32,471 --> 05:28:36,101 and here, that would actually shift\n 5700 05:28:36,100 --> 05:28:41,001 need to make room for a backslash n\nhere and another one over here. 5701 05:28:41,001 --> 05:28:43,173 So it would take two\nmore bytes, exactly. 5702 05:28:43,841 --> 05:28:50,311 AUDIENCE: So if hi exclamation\n 5703 05:28:50,311 --> 05:28:56,891 too as 72, 73, 33, if we are to\n 5704 05:28:56,890 --> 05:29:03,350 and convert them into binary how\n 5705 05:29:04,651 --> 05:29:06,651 DAVID MALAN: And what's\nthe last thing you said? 5706 05:29:08,066 --> 05:29:09,960 DAVID MALAN: It's context sensitive. 5707 05:29:09,960 --> 05:29:12,710 So if, at the end of the day, all\n 5708 05:29:12,710 --> 05:29:16,640 like 72, 73, 33, recall\nthat it's up to the program 5709 05:29:16,640 --> 05:29:19,730 to decide, based on context,\nhow to interpret them. 5710 05:29:19,730 --> 05:29:23,570 And I simplified this story in week 0\n 5711 05:29:23,570 --> 05:29:27,170 as RGB colors, and iMessage\nor a text messaging program 5712 05:29:27,170 --> 05:29:31,700 interprets them as letters, and\n 5713 05:29:31,701 --> 05:29:36,800 How those programs do it is by way\n 5714 05:29:37,341 --> 05:29:39,133 And in fact, later this\nsemester, we'll see 5715 05:29:39,133 --> 05:29:43,761 a data type via which you can represent\n 5716 05:29:43,760 --> 05:29:46,501 a red value, a green\nvalue, and a blue value.
5717 05:29:46,501 --> 05:29:48,861 So we'll see other data types as well. 5718 05:29:49,361 --> 05:29:53,581 AUDIENCE: It seems easy enough to just\n 5719 05:29:53,580 --> 05:29:56,450 so why do we have integers\nand long integers? 5720 05:29:56,451 --> 05:29:59,453 Why can't we make everything\nvariable in its data size? 5721 05:29:59,453 --> 05:30:01,161 DAVID MALAN: Really\ninteresting question. 5722 05:30:01,161 --> 05:30:04,370 Why could we not just make all\ndata types variable in size? 5723 05:30:04,370 --> 05:30:07,820 And some languages, some\nlibraries do exactly this. 5724 05:30:07,820 --> 05:30:11,361 C is an older language, and\nbecause memory was expensive 5725 05:30:12,561 --> 05:30:14,901 The reality was you\ngain benefits from just 5726 05:30:14,901 --> 05:30:17,271 standardizing the size of these things. 5727 05:30:17,271 --> 05:30:19,671 You also get performance\nincreases in the sense 5728 05:30:19,670 --> 05:30:23,880 that if you know every int is\n4 bytes, you can very quickly 5729 05:30:23,881 --> 05:30:26,480 and we'll see this next week,\njump from integer to another 5730 05:30:26,480 --> 05:30:30,861 to another in memory just by adding\n 5731 05:30:30,861 --> 05:30:32,691 You can very quickly poke around. 5732 05:30:32,690 --> 05:30:35,782 Whereas, if you had variable\nlength numbers, you would have to 5733 05:30:35,782 --> 05:30:38,240 kind of, follow, follow, follow,\nlooking for the end of it. 5734 05:30:38,241 --> 05:30:41,041 Follow, follow-- you would have to\n 5735 05:30:41,041 --> 05:30:42,583 So that's a topic we'll come back to. 5736 05:30:42,582 --> 05:30:44,960 But it was generally for efficiency. 5737 05:30:46,431 --> 05:30:52,203 AUDIENCE: Why not store the\nnul character [INAUDIBLE] 5738 05:30:52,203 --> 05:30:55,781 DAVID MALAN: Good question\nwhy not store the-- 5739 05:30:55,780 --> 05:30:59,800 why not store the nul\ncharacter at the beginning? 
5740 05:30:59,800 --> 05:31:06,150 You could-- let's see, why\nnot store it at the beginning? 5741 05:31:09,341 --> 05:31:12,585 You could absolutely--\nwell, could you do this? 5742 05:31:15,841 --> 05:31:20,640 If you were to do that\nat the beginning-- 5743 05:31:22,681 --> 05:31:24,888 No, because I finally thought\nof a problem with this. 5744 05:31:24,888 --> 05:31:26,743 If you store it at\nthe beginning instead 5745 05:31:26,743 --> 05:31:29,161 we'll see in just a moment how\nyou can actually write code 5746 05:31:29,161 --> 05:31:31,411 to figure out where\nthe end of a string is 5747 05:31:31,411 --> 05:31:33,811 and the problem there\nis you wouldn't necessarily 5748 05:31:33,811 --> 05:31:37,261 know if you eventually hit a\n0 at the end of the string 5749 05:31:37,260 --> 05:31:41,070 because it's the number 0 in the\n 5750 05:31:41,070 --> 05:31:44,440 or if it's the context of some\nother data type, altogether. 5751 05:31:44,440 --> 05:31:46,860 So the fact that we've standardized-- 5752 05:31:46,861 --> 05:31:50,820 the fact that we've standardized\nstrings as ending with nul 5753 05:31:50,820 --> 05:31:54,916 means that we can reliably distinguish\n 5754 05:31:54,916 --> 05:31:56,820 And that's actually a\nperfect segue, now 5755 05:31:56,820 --> 05:31:59,954 to actually using this\nprimitive to build up 5756 05:31:59,954 --> 05:32:02,621 our own code that manipulates\nthese things that are lower level. 5757 05:32:03,820 --> 05:32:05,911 Let me create a new file called length. 5758 05:32:05,911 --> 05:32:10,260 And let's use this basic idea to\n 5759 05:32:10,260 --> 05:32:14,980 is after it's been stored in a variable.
5760 05:32:16,120 --> 05:32:20,790 Let me include both the CS50\nheader and the standard I/O header 5761 05:32:20,791 --> 05:32:25,511 give myself int main(void) again\n 5762 05:32:25,510 --> 05:32:28,320 Let me prompt the user for\na string s and I'll ask them 5763 05:32:28,320 --> 05:32:32,431 for a string like their name, here. 5764 05:32:32,431 --> 05:32:37,681 And then let me name it more\nverbosely name this time. 5765 05:32:37,681 --> 05:32:39,431 Now let me go ahead and do this. 5766 05:32:39,431 --> 05:32:44,521 Let me iterate over every\ncharacter in this string 5767 05:32:44,521 --> 05:32:46,440 in order to figure out\nwhat its length is. 5768 05:32:46,440 --> 05:32:49,320 So initially, I'm going\nto go ahead and say this 5769 05:32:49,320 --> 05:32:52,300 int length equals 0, because\nI don't know what it is yet. 5770 05:32:52,300 --> 05:32:53,550 So we're going to start at 0. 5771 05:32:53,550 --> 05:32:56,670 And then while the following is true-- 5772 05:32:56,670 --> 05:33:01,630 while-- let me-- do I want to do this? 5773 05:33:01,631 --> 05:33:04,320 Let me change this to i,\njust for clarity, let me do 5774 05:33:04,320 --> 05:33:10,050 this, while name bracket i does not\n 5775 05:33:10,050 --> 05:33:13,440 So I typed it on the slide is N-U-L,\n 5776 05:33:13,440 --> 05:33:17,925 you actually use its numeric equivalent,\n 5777 05:33:17,925 --> 05:33:23,190 While name bracket i does not equal the\n 5778 05:33:23,190 --> 05:33:26,730 and increment i to i plus plus. 5779 05:33:26,730 --> 05:33:29,730 And then down here I'm going\nto print out the value of i 5780 05:33:29,730 --> 05:33:33,530 to see what we actually get,\nprinting out the value of i. 5781 05:33:33,530 --> 05:33:35,280 All right, so what's\ngoing to happen here? 5782 05:33:39,001 --> 05:33:43,831 ./length and let me type in something\n 5783 05:33:45,001 --> 05:33:48,210 Let me try bye,\nexclamation point, Enter. 5784 05:33:50,131 --> 05:33:52,771 Let me try my own name, David, Enter. 
5785 05:33:54,230 --> 05:33:56,140 So what's actually going on here? 5786 05:33:56,140 --> 05:33:58,751 Well, it seems that\nby way of this while loop 5787 05:33:58,751 --> 05:34:00,883 we are specifying a\nlocal variable called 5788 05:34:00,883 --> 05:34:03,841 i initialized to 0, because we're\n 5789 05:34:04,841 --> 05:34:08,311 I'm then asking the\nquestion, does location 0 5790 05:34:08,311 --> 05:34:13,561 that is i in the name string,\nwhich we now know is an array 5791 05:34:15,960 --> 05:34:19,905 Because if it doesn't, that means it's\n 5792 05:34:21,901 --> 05:34:25,171 Then, let's come back around to line\n 5793 05:34:26,850 --> 05:34:30,681 So does name bracket 1 not equal backslash 0? 5794 05:34:30,681 --> 05:34:36,331 Well, if it doesn't, and it won't\nif it's an i, or a y, or an a 5795 05:34:36,330 --> 05:34:39,750 based on what I typed in, we're\ngoing to increment i once more. 5796 05:34:39,751 --> 05:34:43,201 Fast-forward to the end of the story,\n 5797 05:34:43,201 --> 05:34:46,681 technically, one space\npast the end of the string 5798 05:34:46,681 --> 05:34:49,771 name bracket i will equal backslash 0. 5799 05:34:49,771 --> 05:34:54,221 So I don't increment i anymore, I\n 5800 05:34:54,221 --> 05:34:58,771 So what we seem to have here with some\n 5801 05:34:58,771 --> 05:35:03,331 is a program that figures out the length\n 5802 05:35:03,330 --> 05:35:06,120 Let's practice our abstraction\nand decompose this into 5803 05:35:06,120 --> 05:35:07,530 maybe, a helper function here.
5804 05:35:07,530 --> 05:35:11,370 Let me grab all of this\ncode here, and assume 5805 05:35:11,370 --> 05:35:15,841 for the sake of discussion for a moment,\n 5806 05:35:18,001 --> 05:35:21,091 And the length of the string\nis name that I want to get 5807 05:35:21,091 --> 05:35:25,260 and then I'll go ahead and print\nout, just as before with %i 5808 05:35:26,658 --> 05:35:28,951 So now I'm abstracting away\nthis notion of figuring out 5809 05:35:29,992 --> 05:35:32,730 That's an opportunity for\nme to create my own function. 5810 05:35:32,730 --> 05:35:35,775 If I want to create a\nfunction called string length 5811 05:35:35,776 --> 05:35:39,871 I'll claim that I want to\ntake a string as input 5812 05:35:39,870 --> 05:35:45,120 and what should I have this\nfunction return as its return type? 5813 05:35:45,120 --> 05:35:50,350 What should string length\npresumably return? 5814 05:35:53,280 --> 05:35:55,198 Float really wouldn't\nmake sense because we're 5815 05:35:55,198 --> 05:35:57,637 measuring things that are integers. 5816 05:35:57,637 --> 05:35:59,220 In this case, the length of something. 5817 05:35:59,221 --> 05:36:00,901 So indeed, let's have it return an int. 5818 05:36:00,901 --> 05:36:03,641 I can use the same\ncode as before, so I'm 5819 05:36:03,640 --> 05:36:06,435 going to paste what I\ncut earlier in the file. 5820 05:36:06,436 --> 05:36:10,921 The only thing I have to change\nis the name of the variable. 5821 05:36:10,920 --> 05:36:14,501 Because now this function,\nI decided arbitrarily 5822 05:36:14,501 --> 05:36:17,390 that I'm going to call it\ns, just to be more generic. 5823 05:36:17,390 --> 05:36:20,175 So I'm going to look at s\nbracket i at each location. 5824 05:36:20,175 --> 05:36:23,050 And I don't want to print it at the\n 5825 05:36:23,050 --> 05:36:25,510 What's the line of code I should\ninclude here if I actually 5826 05:36:25,510 --> 05:36:28,265 want to hand back the total length?
5827 05:36:31,372 --> 05:36:33,530 DAVID MALAN: Return i, in this case. 5828 05:36:33,530 --> 05:36:35,800 So I'm going to return i, not print it. 5829 05:36:35,800 --> 05:36:40,751 Because now, my main function can\n 5830 05:36:40,751 --> 05:36:42,791 and print it on the next line itself. 5831 05:36:42,791 --> 05:36:46,781 I just need a prototype, so that's\n 5832 05:36:46,780 --> 05:36:48,431 I'm going to rerun make length. 5833 05:36:48,431 --> 05:36:49,901 Hopefully I didn't screw up. 5834 05:36:49,901 --> 05:36:53,591 I didn't. ./length,\nI'll type in hi-- oops-- 5835 05:36:56,140 --> 05:36:59,230 I'll type in bye again, and so forth. 5836 05:36:59,230 --> 05:37:02,963 So now we have a function that\n 5837 05:37:02,963 --> 05:37:05,381 Well, it turns out we didn't\nactually need this all along. 5838 05:37:05,381 --> 05:37:10,302 It turns out that we can get rid of my\n 5839 05:37:10,302 --> 05:37:12,760 I can definitely delete the\nwhole implementation down here. 5840 05:37:12,760 --> 05:37:16,420 Because it turns out, in\na file called string.h 5841 05:37:16,420 --> 05:37:19,780 which is a new header file today, we\n 5842 05:37:19,780 --> 05:37:23,950 called, more succinctly,\nstrlen, S-T-R-L-E-N. Which 5843 05:37:25,390 --> 05:37:29,501 This is a function that comes with C,\n 5844 05:37:29,501 --> 05:37:33,710 and it does what we just\nimplemented manually. 5845 05:37:33,710 --> 05:37:37,600 So here's an example of, admittedly, a\n 5846 05:37:38,741 --> 05:37:41,111 And how do you know what kinds\nof functions exist? 5847 05:37:41,111 --> 05:37:45,521 Well, let me pop out of my\nbrowser here to a website that 5848 05:37:45,521 --> 05:37:48,716 is CS50's incarnation of\nwhat are called manual pages.
5849 05:37:48,716 --> 05:37:52,331 It turns out that in a lot\nof systems, Macs, and Unix 5850 05:37:52,330 --> 05:37:55,360 and Linux systems, including\nthe Visual Studio Code 5851 05:37:55,361 --> 05:37:57,280 instance that we have\nin the cloud, there 5852 05:37:57,280 --> 05:38:00,550 are publicly accessible\nmanual pages for functions. 5853 05:38:00,550 --> 05:38:04,030 They tend to be written very\nexpertly, in a way that's 5854 05:38:05,420 --> 05:38:09,911 So we have here at\nmanual.cs50.io is CS50's version 5855 05:38:09,911 --> 05:38:13,001 of manual pages that have this\nless-comfortable mode that 5856 05:38:13,001 --> 05:38:15,550 give you a, sort of, cheat\nsheet of very frequently used 5857 05:38:15,550 --> 05:38:19,271 helpful functions in C. And\nwe've translated the expert 5858 05:38:19,271 --> 05:38:22,335 notation to things that a\nbeginner can understand. 5859 05:38:22,335 --> 05:38:26,451 So, for instance, let me go ahead and\n 5860 05:38:26,451 --> 05:38:30,460 You'll see that there's documentation\n 5861 05:38:30,460 --> 05:38:32,771 but more interestingly\ndown here, there's 5862 05:38:32,771 --> 05:38:35,111 a whole bunch of\nstring-related functions 5863 05:38:35,111 --> 05:38:36,881 that we haven't even seen most of, yet. 5864 05:38:36,881 --> 05:38:38,921 But there's indeed one\nhere called strlen 5865 05:38:38,920 --> 05:38:40,880 calculate the length of a string. 5866 05:38:40,881 --> 05:38:46,421 And so if I go to strlen here, I'll\n 5867 05:38:47,230 --> 05:38:49,661 And the way a manual\npage typically works 5868 05:38:49,661 --> 05:38:52,570 whether in CS50's format\nor any other, system 5869 05:38:52,570 --> 05:38:55,210 is you see, typically, a\nsynopsis of what header 5870 05:38:55,210 --> 05:38:57,591 files you need to use the function. 5871 05:38:57,591 --> 05:39:00,221 So you would copy paste\nthese couple of lines here. 
5872 05:39:00,221 --> 05:39:03,791 You see what the prototype\nis of the function so 5873 05:39:03,791 --> 05:39:06,794 that you know what its inputs are,\n 5874 05:39:06,794 --> 05:39:09,461 Then down below you might see a\ndescription, which in this case 5875 05:39:10,580 --> 05:39:12,430 This function calculates\nthe length of s. 5876 05:39:12,431 --> 05:39:15,370 Then you see what the\nreturn value is, if any 5877 05:39:15,370 --> 05:39:18,570 and you might even see an example, like\n 5878 05:39:18,570 --> 05:39:21,273 So these manual pages\nwhich are again, accessible 5879 05:39:21,273 --> 05:39:23,980 here, and we'll link to these in\n 5880 05:39:23,980 --> 05:39:26,771 are pretty much the place to\nstart when you want to figure out 5881 05:39:26,771 --> 05:39:29,471 has a wheel been invented already? 5882 05:39:29,471 --> 05:39:32,751 Is there a function that might help\n 5883 05:39:32,751 --> 05:39:36,161 so that I don't have to really\nget into the weeds of doing all 5884 05:39:36,161 --> 05:39:37,973 of those lower-level steps as I've had. 5885 05:39:37,973 --> 05:39:40,931 Sometimes the answer is going to be\n 5886 05:39:40,931 --> 05:39:43,421 But again the point of our\nhaving just done this together 5887 05:39:43,420 --> 05:39:46,210 is to reveal that even the\nfunctions you start taking for 5888 05:39:46,210 --> 05:39:50,396 granted, they all reduce to some\nof these basic building blocks. 5889 05:39:50,396 --> 05:39:53,861 At the end of the day, this is\n 5890 05:39:55,210 --> 05:39:57,320 We're just learning,\nnow, how to harness those 5891 05:39:57,320 --> 05:40:01,480 and how to manipulate them ourselves. 5892 05:40:08,065 --> 05:40:16,039 AUDIENCE: We did just see\n[INAUDIBLE] Is that so common 5893 05:40:16,040 --> 05:40:18,296 that we would have to\nspecify it, or is it not? 5894 05:40:18,295 --> 05:40:19,420 DAVID MALAN: Good question. 5895 05:40:19,420 --> 05:40:22,180 Is it so common that you would\nhave to specify it or not? 
5896 05:40:22,181 --> 05:40:24,431 You do need to include its\nheader files because that's 5897 05:40:24,431 --> 05:40:25,931 where all of those prototypes are. 5898 05:40:25,931 --> 05:40:29,451 You don't need to worry about\nlinking it in with -l anything. 5899 05:40:29,451 --> 05:40:31,601 And in fact, moving\nforward, you do not ever 5900 05:40:31,600 --> 05:40:35,170 need to worry about linking in\n 5901 05:40:35,170 --> 05:40:39,200 We, the staff, have configured make to\n 5902 05:40:39,201 --> 05:40:41,291 We want you to understand\nthat it is doing it 5903 05:40:41,291 --> 05:40:43,601 but we'll take care of\nall of the -l's for you. 5904 05:40:43,600 --> 05:40:47,620 But the onus is on you for the\nprototypes and the header files. 5905 05:40:47,620 --> 05:40:51,411 Other questions on these\nrepresentations or techniques? 5906 05:40:51,911 --> 05:41:00,181 AUDIENCE: [INAUDIBLE] exclamation mark. 5907 05:41:00,181 --> 05:41:04,784 How does it actually define\nthe spaces [INAUDIBLE]?? 5908 05:41:04,784 --> 05:41:06,181 DAVID MALAN: A good question. 5909 05:41:06,181 --> 05:41:09,960 If you were to have a string with actual\n 5910 05:41:09,960 --> 05:41:11,791 what would the computer actually do? 5911 05:41:11,791 --> 05:41:14,221 Well for this. let me\ngo to asciichart.com. 5912 05:41:14,221 --> 05:41:19,140 Which is just a random website that's\n 5913 05:41:20,190 --> 05:41:22,780 This is, in fact, what we had\na screenshot of the other day. 5914 05:41:22,780 --> 05:41:26,348 And if you look here, it's a little\n 5915 05:41:26,348 --> 05:41:29,640 If a computer were to store a space, it\n 5916 05:41:29,640 --> 05:41:34,690 32, or technically, the pattern of 0s\n 5917 05:41:34,690 --> 05:41:37,501 All of the US English keys that\nyou might type on a keyboard 5918 05:41:37,501 --> 05:41:40,651 can be represented with a\nnumber, and using Unicode can 5919 05:41:40,651 --> 05:41:43,181 you express even things like\nemojis and other languages. 
5920 05:41:43,681 --> 05:41:47,390 AUDIENCE: Are only strings\nfollowed by nul number 5921 05:41:47,390 --> 05:41:50,776 or let's say we had a series of\nnumbers, would each one of them 5922 05:41:52,105 --> 05:41:53,230 DAVID MALAN: Good question. 5923 05:41:53,230 --> 05:41:56,050 Only strings are accompanied\nby nuls at the end 5924 05:41:56,050 --> 05:41:59,021 because every other data type\nwe've talked about thus far 5925 05:41:59,021 --> 05:42:01,390 is of well defined finite length. 5926 05:42:01,390 --> 05:42:04,451 1 byte for char, 4 bytes\nfor ints and so forth. 5927 05:42:04,451 --> 05:42:08,501 If we think back to last week, we did\n 5928 05:42:08,501 --> 05:42:12,341 Integer overflow, because 4 bytes, heck,\n 5929 05:42:12,341 --> 05:42:14,530 We also talked about\nfloating point imprecision. 5930 05:42:14,530 --> 05:42:17,740 Thankfully in the world of scientific\n 5931 05:42:17,741 --> 05:42:21,191 there are libraries you can\nuse that draw inspiration 5932 05:42:21,190 --> 05:42:23,080 from this idea of a\nstring, and they might 5933 05:42:23,080 --> 05:42:26,900 use 9 bytes for an integer\nvalue or maybe 20 bytes 5934 05:42:26,901 --> 05:42:28,431 that you can count really high. 5935 05:42:28,431 --> 05:42:30,940 But they will then start to\nmanage that memory for you 5936 05:42:30,940 --> 05:42:34,220 and what they're really probably doing\n 5937 05:42:34,221 --> 05:42:37,331 and somehow remembering how\nlong the sequence of bytes is. 5938 05:42:37,330 --> 05:42:40,450 That's how these higher-level\nlibraries work, too. 5939 05:42:40,451 --> 05:42:41,960 All right, this has been a lot. 5940 05:42:41,960 --> 05:42:43,341 Let's take one more break here. 5941 05:42:43,341 --> 05:42:44,931 We'll do a seven-minute break here. 5942 05:42:44,931 --> 05:42:47,725 And when we come back, we'll\nflesh out a few more details. 
5943 05:42:50,651 --> 05:42:55,661 So we just saw strlen as an\nexample of a function that 5944 05:42:55,661 --> 05:42:57,158 comes in the string library. 5945 05:42:57,158 --> 05:42:59,951 Let's start to take more of these\n 5946 05:42:59,951 --> 05:43:03,791 So we're not relying only on the\n 5947 05:43:03,791 --> 05:43:05,921 Let me switch over to VS Code. 5948 05:43:05,920 --> 05:43:10,300 And create a file called, say, string.c 5949 05:43:10,300 --> 05:43:12,376 to apply this lesson\nlearned, as follows. 5950 05:43:12,376 --> 05:43:19,030 Let me include cs50.h,\nstdio.h, and this new thing 5951 05:43:19,030 --> 05:43:21,521 string.h as well, at the top. 5952 05:43:21,521 --> 05:43:23,958 I'm going to do the usual\nint main(void) here. 5953 05:43:23,958 --> 05:43:26,501 And then in this program suppose,\nfor the sake of discussion 5954 05:43:26,501 --> 05:43:29,800 that I didn't know about\n%s for printf or, heck 5955 05:43:29,800 --> 05:43:33,560 maybe early on there\nwas no %s format code. 5956 05:43:33,561 --> 05:43:36,681 And so there was no easy\nway to print strings. 5957 05:43:36,681 --> 05:43:40,091 Well, at least if we know that\n 5958 05:43:40,091 --> 05:43:44,081 we could use %c as a\nworkaround, a solution to that 5959 05:43:45,681 --> 05:43:49,181 So let me ask myself for a\nstring s by using get string here 5960 05:43:49,181 --> 05:43:51,761 and I'll ask the user for some input. 5961 05:43:51,760 --> 05:43:57,520 And then, let me print out say, output\n 5962 05:43:58,721 --> 05:44:02,261 Now, the simplest way to do this, of\n 5963 05:44:02,260 --> 05:44:05,220 printf %s, and plug in\nthe s, and we're done. 5964 05:44:05,221 --> 05:44:07,991 But again, for the sake of\ndiscussion, I forgot about 5965 05:44:07,991 --> 05:44:12,081 or someone didn't implement %s,\nso how else could we do this?
5966 05:44:12,080 --> 05:44:16,060 Well, in pseudo code, or in English\n 5967 05:44:16,061 --> 05:44:23,171 this problem, printing out the string\n 5968 05:44:23,170 --> 05:44:26,680 How might we go about solving this? 5969 05:44:26,681 --> 05:44:28,407 Just in English, high-level? 5970 05:44:28,407 --> 05:44:29,990 What would your pseudo code look like? 5971 05:44:30,491 --> 05:44:33,829 AUDIENCE: You could\njust print each letter. 5972 05:44:33,829 --> 05:44:35,621 DAVID MALAN: OK, so\njust print each letter. 5973 05:44:35,620 --> 05:44:37,751 And maybe, more precisely,\nsome kind of loop. 5974 05:44:37,751 --> 05:44:41,291 Like, let's iterate over\nall of the characters in s 5975 05:44:43,550 --> 05:44:48,310 Well, for int i gets 0 is kind of the\n 5976 05:44:49,841 --> 05:44:51,626 OK, how long do I want to iterate? 5977 05:44:51,626 --> 05:44:53,501 Well, it's going to\ndepend on what I type in 5978 05:44:53,501 --> 05:44:55,561 but that's why we have strlen now. 5979 05:44:55,561 --> 05:45:00,341 So iterate up to the length of\ns, and then increment i with plus plus. 5980 05:45:01,335 --> 05:45:04,931 And then let's just print\nout %c with no new line 5981 05:45:04,931 --> 05:45:07,271 because I want everything\non the same line 5982 05:45:07,271 --> 05:45:12,041 whatever the character\nis at s bracket i. 5983 05:45:12,041 --> 05:45:14,050 And then at the very\nend, I'll give myself 5984 05:45:14,050 --> 05:45:16,611 that new line, just to move the\ncursor down to the next line 5985 05:45:16,611 --> 05:45:18,611 so the dollar sign is\nnot in a weird place. 5986 05:45:18,611 --> 05:45:21,491 All right, so let's see if I\ndidn't screw up any of the code 5987 05:45:21,491 --> 05:45:26,951 make string, Enter, so far so good,\n 5988 05:45:30,280 --> 05:45:33,940 Let me do it once more with\nbye, Enter, and that works, too.
5989 05:45:33,940 --> 05:45:36,670 Notice I very deliberately\nand quickly gave myself 5990 05:45:36,670 --> 05:45:39,520 two spaces here and one space\nhere just because I, literally 5991 05:45:39,521 --> 05:45:42,881 wanted these things to line up properly,\n 5992 05:45:42,881 --> 05:45:46,091 But that was just a\ndeliberate formatting detail. 5993 05:45:47,780 --> 05:45:53,501 Which is a claim I've made before,\nbut it's not well-designed. 5994 05:45:53,501 --> 05:45:57,431 It is well-designed in that I'm using\n 5995 05:45:57,431 --> 05:45:59,921 like, I've not reinvented\na wheel, there's no line 15 5996 05:45:59,920 --> 05:46:02,530 or below, I didn't implement\nstring length myself. 5997 05:46:02,530 --> 05:46:07,900 So I'm at least practicing\nwhat I've preached. 5998 05:46:07,901 --> 05:46:12,621 But there's still an\nimperfection, a suboptimality. 5999 05:46:12,620 --> 05:46:15,170 This one's really subtle though. 6000 05:46:15,170 --> 05:46:18,590 And you have to think\nabout how loops work. 6001 05:46:18,591 --> 05:46:22,901 What am I doing that's\nnot super efficient? 6002 05:46:24,131 --> 05:46:27,439 AUDIENCE: [INAUDIBLE]\nover and over again. 6003 05:46:27,438 --> 05:46:29,230 DAVID MALAN: Yeah, this\nis a little subtle. 6004 05:46:29,230 --> 05:46:31,721 But if you think back to the\nbasic definition of a for loop 6005 05:46:31,721 --> 05:46:34,331 and recall when I highlighted\nthings last week, what happens? 6006 05:46:34,330 --> 05:46:37,090 Well, the first thing\nis that i gets set to 0. 6007 05:46:37,091 --> 05:46:38,570 Then we check the condition. 6008 05:46:38,570 --> 05:46:39,820 How do we check the condition? 6009 05:46:39,820 --> 05:46:42,640 We call strlen on s,\nwe get back an answer 6010 05:46:42,640 --> 05:46:49,070 like 3 if it's a H-I exclamation point\n 6011 05:46:49,070 --> 05:46:50,830 and then we print out the character. 6012 05:46:50,830 --> 05:46:53,320 Then we increment i from 0 to 1.
6013 05:46:54,728 --> 05:46:56,021 How do I recheck the condition? 6014 05:46:58,361 --> 05:47:01,151 Get back the same answer, 3. 6015 05:47:04,061 --> 05:47:08,951 So we print out another character. i\n 6016 05:47:11,170 --> 05:47:12,220 Well, what's the string length of s? 6017 05:47:16,120 --> 05:47:19,690 So I keep asking the same\nquestion sort of stupidly 6018 05:47:19,690 --> 05:47:22,480 because the string is, presumably,\nnever changing in length. 6019 05:47:22,480 --> 05:47:24,418 And indeed, every time\nI check that condition 6020 05:47:24,419 --> 05:47:25,961 that function is going to get called. 6021 05:47:25,960 --> 05:47:28,640 And every time, the answer\nfor hi is going to be 3. 6022 05:47:30,355 --> 05:47:35,111 So it's a marginal suboptimality,\nbut I could do better, right? 6023 05:47:35,111 --> 05:47:39,820 Don't ask multiple times questions\n 6024 05:47:39,820 --> 05:47:45,221 So how could I remember the answer to\n 6025 05:47:45,221 --> 05:47:49,011 How could I remember the\nanswer to this question? 6026 05:47:50,291 --> 05:47:51,707 AUDIENCE: Store it in a variable. 6027 05:47:51,706 --> 05:47:53,440 DAVID MALAN: So store\nit in a variable, right? 6028 05:47:53,440 --> 05:47:56,358 That's been our answer most any time\n 6029 05:47:57,381 --> 05:48:02,140 Well, I could do something like this,\n 6030 05:48:02,140 --> 05:48:05,460 Then I can just change\nthis function call. 6031 05:48:05,460 --> 05:48:07,420 Let me fix my spelling here. 6032 05:48:07,420 --> 05:48:11,620 Let me fix this to be comparing\n 6033 05:48:11,620 --> 05:48:14,501 Because now strlen is only\ncalled once on line 9. 6034 05:48:14,501 --> 05:48:17,001 And I'm reusing the value\nof that variable, a.k.a. 6035 05:48:17,001 --> 05:48:18,501 length, again, and again, and again. 6036 05:48:19,543 --> 05:48:24,021 Turns out that for loops let you\n 6037 05:48:24,021 --> 05:48:28,280 so we can do this a little\nmore elegantly all in one line.
6038 05:48:28,280 --> 05:48:31,030 And this is just some\nsyntactic improvement. 6039 05:48:31,030 --> 05:48:36,190 I could actually do something\nlike this, n equals strlen of s 6040 05:48:36,190 --> 05:48:39,010 and then I could just say n\nhere or I could call it length. 6041 05:48:39,010 --> 05:48:41,927 But heck, while I'm being succinct\n 6042 05:48:41,927 --> 05:48:46,361 So now it's just a marginal\nchange but I've now 6043 05:48:46,361 --> 05:48:50,291 declared two variables\ninside of my loop, i and n. 6044 05:48:50,291 --> 05:48:53,561 i is set to 0. n is set\nto the string length of s. 6045 05:48:53,561 --> 05:48:57,641 But now, hereafter, all of my condition\n 6046 05:48:57,640 --> 05:49:00,431 i less than n, and n is never changing. 6047 05:49:00,431 --> 05:49:02,269 All right, so a marginal\nimprovement there. 6048 05:49:02,269 --> 05:49:04,061 Now that I've used this\nnew function, let's 6049 05:49:04,061 --> 05:49:06,186 use some other functions\nthat might be of interest. 6050 05:49:06,186 --> 05:49:12,941 Let me write a quick program here\n 6051 05:49:12,940 --> 05:49:16,070 changes to uppercase some\nstring that the user types in. 6052 05:49:16,070 --> 05:49:19,751 So let me code a file\ncalled uppercase.c. 6053 05:49:19,751 --> 05:49:25,780 Up here I'll use my new friends,\n 6054 05:49:25,780 --> 05:49:31,330 So standard I/O, and string.h. So\njust as before int main(void). 6055 05:49:31,330 --> 05:49:33,880 And then inside of main, what\nI'm going to do this time 6056 05:49:33,881 --> 05:49:38,651 is let's ask the user for a string\n 6057 05:49:39,940 --> 05:49:44,390 And then let me print\nout something like after.
6058 05:49:44,390 --> 05:49:48,670 So that it-- just so I can see what\n 6059 05:49:48,670 --> 05:49:52,870 And then after this, let me\ndo the following, for int, i 6060 05:49:52,870 --> 05:49:56,290 equals 0, oh, let's\npractice that same lesson 6061 05:49:56,291 --> 05:50:02,050 so n equals the string length of\n 6062 05:50:02,050 --> 05:50:05,861 So really, nothing\nnew, fundamentally yet. 6063 05:50:05,861 --> 05:50:11,530 How do I now convert characters from\n 6064 05:50:11,530 --> 05:50:14,260 In other words, if I type\nin hi, H-I in lowercase 6065 05:50:14,260 --> 05:50:19,751 I want my program, now, to uppercase\n 6066 05:50:19,751 --> 05:50:23,030 Well how can I go about doing this? 6067 05:50:23,030 --> 05:50:25,271 Well you might recall\nthat there is this-- 6068 05:50:25,271 --> 05:50:28,161 you might recall that\nthere is this ASCII chart. 6069 05:50:28,161 --> 05:50:31,116 So let's just consult this\nreal quick on asciichart.com. 6070 05:50:31,116 --> 05:50:35,771 We've looked at this last week\nnotice that a-- capital A is 65 6071 05:50:35,771 --> 05:50:39,701 capital B is 66, capital\nC is 67, and heck, here's 6072 05:50:39,701 --> 05:50:43,901 lowercase a, lowercase b,\nlowercase c, and that's 97, 98, 99. 6073 05:50:43,901 --> 05:50:47,241 And if I actually do some\nmath, there's a distance of 32. 6074 05:50:47,741 --> 05:50:49,901 So if I want to go from\nuppercase to lowercase 6075 05:50:49,901 --> 05:50:55,049 I can do 65 plus 32 will give me\n97 and that actually works out 6076 05:50:55,048 --> 05:50:56,591 across the board for everything else. 6077 05:50:56,591 --> 05:51:00,280 66 plus 32 gets me to 98 or lowercase b. 
6078 05:51:00,280 --> 05:51:04,900 Or conversely, if you have a\nlowercase a, and its value is 97 6079 05:51:04,901 --> 05:51:11,111 subtract 32 and boom, you have capital\n 6080 05:51:11,111 --> 05:51:13,721 But now that we know that\nstrings are just arrays 6081 05:51:13,721 --> 05:51:17,591 and we know that characters,\nwhich are in those arrays 6082 05:51:17,591 --> 05:51:20,710 are just binary\nrepresentations of numbers 6083 05:51:20,710 --> 05:51:23,558 I think we can manipulate a\nfew of these things as follows. 6084 05:51:23,558 --> 05:51:25,390 Let me go back to my\nprogram here, and first 6085 05:51:25,390 --> 05:51:29,620 ask the question, if the current\n 6086 05:51:29,620 --> 05:51:33,190 is lowercase, let's\nforce it to uppercase. 6087 05:51:33,190 --> 05:51:34,510 So how am I going to do that? 6088 05:51:34,510 --> 05:51:40,720 If the character at s bracket i,\n 6089 05:51:40,721 --> 05:51:45,581 is greater than or equal to\nlowercase a, and s bracket 6090 05:51:45,580 --> 05:51:50,920 i is less than or equal to\nlowercase z, kind of a weird Boolean 6091 05:51:50,920 --> 05:51:55,720 expression but it's completely\nlegitimate, because in this array 6092 05:51:55,721 --> 05:51:58,491 s is a whole bunch of characters\nthat the humans typed in 6093 05:51:58,491 --> 05:52:01,781 because that's what a string is,\n 6094 05:52:01,780 --> 05:52:03,940 be a little nonsensical\nbecause when have you ever 6095 05:52:03,940 --> 05:52:05,591 compared numbers to letters? 6096 05:52:05,591 --> 05:52:11,829 But we know from week 0 lowercase a\n 6097 05:52:16,850 --> 05:52:20,650 And so that would allow us to answer\n 6098 05:52:21,670 --> 05:52:24,790 All right, so let me\nanswer that question. 6099 05:52:24,791 --> 05:52:27,401 If it is, what do I want to print out? 6100 05:52:27,401 --> 05:52:30,131 I don't want to print\nout the letter itself 6101 05:52:30,131 --> 05:52:33,550 I want to print out the\nletter minus 32, right? 
6102 05:52:33,550 --> 05:52:37,420 Because if it happens to be a\nlowercase a, 97, 97 minus 32 6103 05:52:37,420 --> 05:52:39,790 gives me 65, which is\nuppercase A, and I know that 6104 05:52:39,791 --> 05:52:43,121 just from having stared\nat that chart in the past. 6105 05:52:43,120 --> 05:52:48,433 Else if the character is not\nbetween little a and little z 6106 05:52:48,433 --> 05:52:50,140 I'm just going to\nprint out the character 6107 05:52:50,140 --> 05:52:52,810 itself by printing s bracket i. 6108 05:52:52,811 --> 05:52:55,841 And at the very end of this, I'm\n 6109 05:52:55,841 --> 05:52:57,741 to move the cursor to the next line. 6110 05:52:57,741 --> 05:52:59,191 So again, it's a little wordy. 6111 05:52:59,190 --> 05:53:03,280 But this loop here, which I\nborrowed from our code previously 6112 05:53:03,280 --> 05:53:05,771 just iterates over the string, a.k.a. 6113 05:53:05,771 --> 05:53:08,890 array, character-by-character,\nthrough its length. 6114 05:53:08,890 --> 05:53:11,620 This line 11 here is\njust asking the question 6115 05:53:11,620 --> 05:53:15,130 if that current character,\nthe i-th character of s 6116 05:53:15,131 --> 05:53:18,161 is greater than or equal\nto little a and less 6117 05:53:18,161 --> 05:53:23,501 than or equal to little z, that\nis between 97 and 132, then 6118 05:53:23,501 --> 05:53:29,201 we're going to go ahead and\nforce it to uppercase instead. 6119 05:53:29,201 --> 05:53:33,550 All right, and let me zoom\nout here for just a second. 6120 05:53:33,550 --> 05:53:38,530 And sorry, I misspoke, 122, which\nis what you might have said. 6121 05:53:41,530 --> 05:53:44,540 Let me go ahead now and\ncompile and run this program. 6122 05:53:44,541 --> 05:53:50,471 So make uppercase, ./uppercase, and\n 6123 05:53:50,471 --> 05:53:52,781 And there's the capitalized\nversion, thereof.
6124 05:53:52,780 --> 05:53:55,181 Let me do it again, with\nmy own name in lowercase 6125 05:53:55,181 --> 05:53:57,361 and now it's capitalized as well. 6126 05:53:57,361 --> 05:53:59,120 Well, what could we do to improve this? 6127 05:54:00,221 --> 05:54:01,901 Let's stop reinventing wheels. 6128 05:54:01,901 --> 05:54:04,101 Let's go to the manual pages. 6129 05:54:04,100 --> 05:54:07,751 So let me go here and search for\nsomething like, I don't know 6130 05:54:09,881 --> 05:54:12,730 I did some auto complete\nhere, our little search box 6131 05:54:12,730 --> 05:54:14,980 is saying that, OK there's\nan is-lower function 6132 05:54:14,980 --> 05:54:16,811 check whether a character is lowercase. 6133 05:54:17,901 --> 05:54:23,411 Well let me check, is lower, now I see\n 6134 05:54:27,163 --> 05:54:28,870 that's the header file\nI need to include. 6135 05:54:28,870 --> 05:54:32,830 This is the prototype for is-lower,\n 6136 05:54:35,591 --> 05:54:38,661 I feel like is-lower should\nreturn true or false. 6137 05:54:38,661 --> 05:54:42,940 So let's scroll down to the\ndescription and return value. 6138 05:54:42,940 --> 05:54:45,070 It returns, oh this is interesting. 6139 05:54:45,070 --> 05:54:49,631 And this is a convention in C. This\n 6140 05:54:49,631 --> 05:54:55,081 if C is a lowercase letter and 0\nif C is not a lowercase letter. 6141 05:54:57,491 --> 05:55:02,591 So like 1, negative 1, something that's\n 6142 05:55:02,591 --> 05:55:05,661 and 0 if it is not a lowercase letter. 6143 05:55:05,661 --> 05:55:07,420 So how can we use this building block? 6144 05:55:07,420 --> 05:55:09,490 Let me go back to my code here. 6145 05:55:09,491 --> 05:55:13,871 Let me add this file, include ctype.h. 6146 05:55:13,870 --> 05:55:17,380 And down here, let me get rid of\nthis cryptic expression, which 6147 05:55:17,381 --> 05:55:23,320 was kind of painful to come up with,\n 6148 05:55:26,230 --> 05:55:29,650 That should actually work but why? 
6149 05:55:29,651 --> 05:55:34,781 Well is-lower, again, returns a non-zero\n 6150 05:55:36,411 --> 05:55:37,675 That means it could return 1. 6151 05:55:37,675 --> 05:55:38,800 It could return negative 1. 6152 05:55:38,800 --> 05:55:40,631 It could return 50 or negative 50. 6153 05:55:40,631 --> 05:55:42,911 It's actually not\nprecisely defined, why? 6154 05:55:43,960 --> 05:55:48,010 This was a common convention to\nuse 0 to represent false and use 6155 05:55:48,010 --> 05:55:50,380 any other value to represent true. 6156 05:55:50,381 --> 05:55:54,401 And so it turns out, that\ninside of Boolean expressions 6157 05:55:54,401 --> 05:55:59,016 if you put a value like a function\n 6158 05:55:59,015 --> 05:56:00,640 that's going to be equivalent to false. 6159 05:56:00,640 --> 05:56:03,236 It's like the answer\nbeing no, it is not lower. 6160 05:56:03,236 --> 05:56:06,251 But you can also, in\nparentheses, put the name 6161 05:56:06,251 --> 05:56:10,181 of the function and its arguments,\n 6162 05:56:10,181 --> 05:56:15,491 Because we could do something like\n 6163 05:56:16,508 --> 05:56:19,091 Because that's the definition,\nif it returns a non-zero value 6164 05:56:20,021 --> 05:56:23,471 But a more succinct way to do that\n 6165 05:56:23,471 --> 05:56:28,371 If it's is lower, then print\nout the character minus 32. 6166 05:56:28,370 --> 05:56:30,850 So this would be the common\nway of using one of these 6167 05:56:30,850 --> 05:56:34,286 is- functions to check if\nthe answer is true or false. 6168 05:56:37,070 --> 05:56:38,931 DAVID MALAN: OK, well we might be done. 6169 05:56:43,780 --> 05:56:47,440 It would be incorrect to check for\n 6170 05:56:47,440 --> 05:56:49,810 You want to check for the opposite of 0. 6171 05:56:51,131 --> 05:56:56,081 Or more succinctly, like I did by\n 6172 05:56:56,080 --> 05:56:58,820 Let me see what happens here. 
6173 05:56:58,820 --> 05:57:02,951 So this is great, but some of you\n 6174 05:57:03,940 --> 05:57:06,490 A moment ago when we were on\nthe manual pages searching 6175 05:57:06,491 --> 05:57:09,641 for things related to lowercase,\nwhat might be another building 6176 05:57:13,420 --> 05:57:14,960 Based on what's on the screen here? 6177 05:57:18,401 --> 05:57:21,359 There's a function that would literally\n 6178 05:57:21,359 --> 05:57:24,293 so I don't have to get into the\nweeds of negative 32, plus 32. 6179 05:57:24,293 --> 05:57:25,751 I don't have to consult that chart. 6180 05:57:25,751 --> 05:57:29,381 Someone has solved this\nproblem for me in the past. 6181 05:57:29,381 --> 05:57:33,941 And let's see if I can\nactually get back to it. 6182 05:57:34,780 --> 05:57:36,800 Let me go ahead, now, and use this. 6183 05:57:36,800 --> 05:57:39,490 So instead of doing\ns bracket i minus 32 6184 05:57:39,491 --> 05:57:44,141 let's use a function that someone else\n 6185 05:57:44,681 --> 05:57:47,511 And now it's going to\ndo the solution for me. 6186 05:57:47,510 --> 05:57:54,790 So if I rerun make uppercase, and then\n 6187 05:57:54,791 --> 05:57:56,381 now it's working as expected. 6188 05:57:56,381 --> 05:58:00,131 And honestly, if I read the\ndocumentation for to-upper 6189 05:58:00,131 --> 05:58:03,431 by going back to its man page,\nor manual page, what you'll see 6190 05:58:03,431 --> 05:58:08,681 is that it says if it's lowercase,\n 6191 05:58:09,311 --> 05:58:13,174 If it's not lowercase, it's already\nuppercase, it's punctuation 6192 05:58:13,173 --> 05:58:14,966 it will just return\nthe original character. 
6193 05:58:14,966 --> 05:58:18,161 Which means, thanks to this\nfunction, I can actually 6194 05:58:18,161 --> 05:58:21,911 tighten this up significantly,\nget rid of all of my conditional 6195 05:58:21,911 --> 05:58:26,291 there, and just print out\nthe to-upper return value 6196 05:58:26,291 --> 05:58:29,320 and leave it to whoever wrote\nthat function to figure out 6197 05:58:29,320 --> 05:58:33,730 if something's uppercase or lowercase. 6198 05:58:33,730 --> 05:58:38,080 All right, questions on\nthese kinds of tricks? 6199 05:58:38,080 --> 05:58:41,350 Again, it all reduces to\nweek 0 basics, but we're just 6200 05:58:41,350 --> 05:58:43,010 building these abstractions on top. 6201 05:58:43,510 --> 05:58:45,468 AUDIENCE: I'm wondering\nif there's any way just 6202 05:58:45,469 --> 05:58:49,370 to import all packages under\na certain subdomain instead 6203 05:58:49,370 --> 05:58:51,380 of having to do multiple\n[INAUDIBLE] statements 6204 05:58:51,381 --> 05:58:52,673 kind of like a star [INAUDIBLE] 6205 05:58:54,440 --> 05:58:57,380 There is no easy way in C\nto say, give me everything. 6206 05:58:57,381 --> 05:58:59,931 That was for, historically,\nperformance reasons. 6207 05:58:59,931 --> 05:59:03,201 They want you to be explicit\nas to what you want to include. 6208 05:59:03,201 --> 05:59:05,991 In other languages like\nPython, Java, one of which 6209 05:59:05,991 --> 05:59:08,774 we'll see later this term, you\ncan say, give me everything. 6210 05:59:08,774 --> 05:59:11,691 But that, actually, tends to be best\n 6211 05:59:11,690 --> 05:59:14,260 execution or compilation of your code. 6212 05:59:14,760 --> 05:59:17,105 AUDIENCE: Does to-upper\naccommodate for special characters? 6213 05:59:17,600 --> 05:59:20,240 Does to-upper accommodate special\ncharacters like punctuation? 
6214 05:59:20,741 --> 05:59:22,701 If I read the documentation\nmore pedantically 6215 05:59:23,971 --> 05:59:27,201 It will properly hand me\nback an exclamation point 6216 05:59:28,861 --> 05:59:33,230 So if I do make uppercase here,\nand let me do ./upper, sorry-- 6217 05:59:33,230 --> 05:59:37,881 ./uppercase, hi with an exclamation\n 6218 05:59:37,881 --> 05:59:40,070 pass it through unchanged. Yeah? 6219 05:59:40,070 --> 05:59:43,460 AUDIENCE: Do we have access to a\nfunction that would do all of that 6220 05:59:43,460 --> 05:59:45,850 but just to the screen\nrather than to [INAUDIBLE] 6221 05:59:45,850 --> 05:59:47,810 DAVID MALAN: Really good question, too. 6222 05:59:47,811 --> 05:59:52,371 No, we do not have access to a function\n 6223 05:59:52,370 --> 05:59:56,001 with CS50's library that will just\n 6224 05:59:56,001 --> 05:59:58,431 In C, that's actually\neasier said than done. 6225 05:59:59,811 --> 06:00:04,070 So stay tuned for another language\n 6226 06:00:04,070 --> 06:00:06,771 All right, so what does\nthis leave us with? 6227 06:00:06,771 --> 06:00:08,780 There's just a-- let's\ncome full circle now 6228 06:00:08,780 --> 06:00:11,751 to where we began today where we\n 6229 06:00:12,350 --> 06:00:16,070 Recall that we talked about rm\ntaking command line arguments. 6230 06:00:16,070 --> 06:00:18,730 The file you want to delete,\nwe talked about clang 6231 06:00:18,730 --> 06:00:20,480 taking command line\narguments, that again 6232 06:00:20,480 --> 06:00:22,400 modify the behavior of the program. 6233 06:00:22,401 --> 06:00:25,941 How is it that maybe you and I\ncan start to write programs that 6234 06:00:25,940 --> 06:00:28,100 actually take command line arguments?
6235 06:00:28,100 --> 06:00:31,880 Well here is where I\ncan finally explain why 6236 06:00:31,881 --> 06:00:35,001 we've been typing int\nmain(void) for the past week 6237 06:00:35,001 --> 06:00:38,751 and just asking that you take on faith\n 6238 06:00:38,751 --> 06:00:45,081 Well, by default in C, at least\n 6239 06:00:45,080 --> 06:00:48,270 there's only two official\nways to write main functions. 6240 06:00:48,271 --> 06:00:50,721 You might see other formats\nonline, but they're generally 6241 06:00:50,721 --> 06:00:53,131 not consistent with the\ncurrent specification. 6242 06:00:53,131 --> 06:00:56,421 This, again, was sort of a\nboilerplate for the simplest 6243 06:00:56,420 --> 06:00:59,030 function we might write last\nweek, and recall that we've 6244 06:00:59,030 --> 06:01:00,470 been doing this the whole time. 6245 06:01:00,471 --> 06:01:05,251 (Void) What that (void) means, for all\n 6246 06:01:05,251 --> 06:01:08,151 and you have written thus far,\nis that none of our programs 6247 06:01:08,151 --> 06:01:11,301 that we've written take\ncommand line arguments. 6248 06:01:11,300 --> 06:01:13,370 That's what the void there means. 6249 06:01:13,370 --> 06:01:18,210 It turns out that main is the way you\n 6250 06:01:18,210 --> 06:01:20,001 in fact, take command\nline arguments, that 6251 06:01:20,001 --> 06:01:24,021 is words after the command\nin your terminal window. 6252 06:01:24,021 --> 06:01:26,480 If you want to actually not\nuse get int or get string 6253 06:01:26,480 --> 06:01:30,230 you want the human to be able to\n 6254 06:01:31,100 --> 06:01:34,200 And just run-- print\nhello, David on the screen. 6255 06:01:34,201 --> 06:01:38,721 You can use command line arguments,\nwords after the program name 6256 06:01:41,010 --> 06:01:44,720 So we're going to change this in a\n 6257 06:01:44,721 --> 06:01:48,191 but something that's now a bit\nmore familiar syntactically. 
6258 06:01:48,190 --> 06:01:52,700 If you change that (void) in main\n 6259 06:01:52,701 --> 06:01:57,741 int, argc, comma, string, argv,\nopen bracket, close bracket 6260 06:01:57,741 --> 06:02:00,891 you are now giving yourself\naccess to writing programs 6261 06:02:00,890 --> 06:02:03,170 that take command line arguments. 6262 06:02:03,170 --> 06:02:06,380 Argc, which stands for\nargument count is going 6263 06:02:06,381 --> 06:02:10,671 to be an integer that stores how many\n 6264 06:02:10,670 --> 06:02:13,310 The C automatically gives that to you. 6265 06:02:13,311 --> 06:02:16,971 String argv stands for\nargument vector, that's 6266 06:02:16,971 --> 06:02:21,361 going to be an array of all of the words\n 6267 06:02:21,361 --> 06:02:23,390 So with today's building\nblock of an array 6268 06:02:23,390 --> 06:02:26,240 we have the ability now to let\nthe humans type as many words 6269 06:02:26,241 --> 06:02:28,161 or as few words, as\nthey want at the prompt. 6270 06:02:28,161 --> 06:02:31,161 C is going to automatically put\nthem in an array called argv 6271 06:02:31,161 --> 06:02:36,620 and it's going to tell us how many\n 6272 06:02:36,620 --> 06:02:40,320 The int, as the return type here,\n 6273 06:02:40,320 --> 06:02:43,611 Let's use this definition\nto make, maybe 6274 06:02:43,611 --> 06:02:45,230 just a couple of simple programs. 6275 06:02:45,230 --> 06:02:47,330 But in problem set 2\nwe will actually use 6276 06:02:47,330 --> 06:02:50,730 this to control the\nbehavior of your own code. 6277 06:02:50,730 --> 06:02:57,381 Let me code up a file called\nargv.0 just to keep it aptly named. 6278 06:02:59,960 --> 06:03:01,501 Let me go ahead and include-- 6279 06:03:02,001 --> 06:03:05,210 That is not the right name of a\nprogram, let's start that over. 6280 06:03:05,210 --> 06:03:09,710 Let's go ahead and code up argv.c.
6281 06:03:11,061 --> 06:03:17,151 include cs50.h, include\nstdio.h, int, main, not void 6282 06:03:17,151 --> 06:03:24,286 let's actually say int, argc, string,\n 6283 06:03:24,286 --> 06:03:26,661 No numbers in between because\nyou don't know, in advance 6284 06:03:26,661 --> 06:03:29,570 how many words the human's\ngoing to type at their prompt. 6285 06:03:29,570 --> 06:03:31,021 Now let's go ahead and do this. 6286 06:03:31,021 --> 06:03:35,061 Let's write a very simple program that\n 6287 06:03:35,061 --> 06:03:36,921 whoever the name is that gets typed. 6288 06:03:36,920 --> 06:03:40,520 But not using get string, let's\ninstead have the human just 6289 06:03:40,521 --> 06:03:44,151 type their name at the prompt, just like\n 6290 06:03:44,151 --> 06:03:46,431 so it's just one and\ndone when you hit Enter. 6291 06:03:47,870 --> 06:03:52,640 Let me go ahead then and do this,\nprintf, quote-unquote, hello 6292 06:03:52,640 --> 06:03:55,760 comma, and instead of world\ntoday, I want to print out 6293 06:03:55,760 --> 06:03:57,630 whatever the human typed in. 6294 06:03:57,631 --> 06:04:03,111 So let's go ahead and do\nthis, argv, bracket 0 for now. 6295 06:04:03,111 --> 06:04:07,341 But I don't think this is quite\nwhat I want because, of course 6296 06:04:07,341 --> 06:04:12,631 that's going to literally print\nout argv, bracket, 0, bracket. 6297 06:04:12,631 --> 06:04:16,771 I need a placeholder, so let me\n 6298 06:04:16,771 --> 06:04:20,780 So if argv is an array, but\nit's an array of strings 6299 06:04:20,780 --> 06:04:24,740 then argv bracket 0 is\nitself a single string. 6300 06:04:24,741 --> 06:04:27,711 And so it can be plugged\ninto that %s placeholder. 6301 06:04:27,710 --> 06:04:30,001 Let me go ahead and save my program. 6302 06:04:30,001 --> 06:04:33,600 And compile argv, so far, so good. 6303 06:04:33,600 --> 06:04:37,431 Let me now type in my name\nafter the name of the program. 
6304 06:04:38,241 --> 06:04:42,541 I'm literally typing an extra word,\n 6305 06:04:42,541 --> 06:04:45,550 OK, it's apparently a little\nbuggy in a couple of ways. 6306 06:04:45,550 --> 06:04:48,760 I forgot my backslash n but\nthat's not a huge deal. 6307 06:04:48,760 --> 06:04:53,220 But apparently, inside of\nargv is literally everything 6308 06:04:53,221 --> 06:04:55,531 that humans typed in including\nthe name of the program. 6309 06:04:55,530 --> 06:05:00,510 So logically, how do I print out hello,\n 6310 06:05:00,510 --> 06:05:01,980 the actual name of the program? 6311 06:05:03,721 --> 06:05:05,311 AUDIENCE: Change the index to 1. 6312 06:05:06,061 --> 06:05:10,201 So presumably index to 1, if that's\n 6313 06:05:11,201 --> 06:05:15,671 So let's do make argv\nagain, ./argv, Enter. 6314 06:05:17,890 --> 06:05:19,951 So this is another form of nul. 6315 06:05:19,951 --> 06:05:23,581 But this is user error, now, on my part. 6316 06:05:23,580 --> 06:05:25,330 I didn't do exactly what I said I would. 6317 06:05:25,830 --> 06:05:26,790 AUDIENCE: You forgot the parameter. 6318 06:05:26,791 --> 06:05:28,691 DAVID MALAN: Yeah, I\nforgot the parameter. 6319 06:05:29,960 --> 06:05:31,710 I should probably deal\nwith that, somehow 6320 06:05:31,710 --> 06:05:33,552 so that people aren't\nbreaking my program 6321 06:05:33,552 --> 06:05:35,260 and printing out random\nthings, like nul. 6322 06:05:35,260 --> 06:05:39,030 But if I do say argv, David,\nnow you see hello, David. 6323 06:05:39,030 --> 06:05:42,330 I can get a little curious,\nlike what's at location 2? 6324 06:05:42,330 --> 06:05:47,670 Well we can see, make argv,\nbracket, ./argv, David, Enter. 6325 06:05:47,670 --> 06:05:49,170 All right, so just nothing is there.
6326 06:05:49,170 --> 06:05:52,462 But it turns out, in a couple of weeks,\n 6327 06:05:52,462 --> 06:05:54,570 and see if we can't crash\nprograms deliberately 6328 06:05:54,570 --> 06:05:57,061 because nothing is\nstopping me from saying 6329 06:05:57,061 --> 06:06:00,730 oh what's at location 2\nmillion, for instance? 6330 06:06:00,730 --> 06:06:02,611 We could really start to get curious. 6331 06:06:02,611 --> 06:06:04,681 But for now, we'll do the right thing. 6332 06:06:04,681 --> 06:06:08,620 But let's now make sure the human has\n 6333 06:06:08,620 --> 06:06:15,181 So let's say this, if argc equals\n 6334 06:06:15,181 --> 06:06:19,021 and one more word after that, go\nahead and trust that in argv 1 6335 06:06:19,021 --> 06:06:21,241 as you proposed, is the person's name. 6336 06:06:21,241 --> 06:06:26,071 Else, let's go ahead and default\n 6337 06:06:26,070 --> 06:06:30,120 like, well, if we don't get a name\n 6338 06:06:31,561 --> 06:06:34,306 So now we're programming defensively. 6339 06:06:34,306 --> 06:06:37,350 This time the human, even if they\n 6340 06:06:37,350 --> 06:06:40,225 or they give us too many names,\n 6341 06:06:40,225 --> 06:06:42,150 because I now have some\nerror handling here. 6342 06:06:42,151 --> 06:06:46,291 Because, again, argc is argument\n 6343 06:06:51,001 --> 06:06:52,800 Let me make the same mistake as before. 6344 06:06:53,311 --> 06:06:55,171 I don't get this weird nul behavior. 6345 06:06:55,170 --> 06:06:56,610 I get something well-defined. 6346 06:06:57,870 --> 06:07:01,111 I could do David Malan, but\nthat's not currently supported. 6347 06:07:01,111 --> 06:07:05,550 I would need to alter my logic to\n 6348 06:07:06,605 --> 06:07:08,030 So what's the point of this? 
6349 06:07:08,030 --> 06:07:09,780 At the moment, it's\njust a simple exercise 6350 06:07:09,780 --> 06:07:14,962 to actually give myself a way of taking\n 6351 06:07:14,962 --> 06:07:16,920 Because, consider, it's\njust more convenient in 6352 06:07:16,920 --> 06:07:18,930 this new, command-line-interface world. 6353 06:07:18,931 --> 06:07:23,118 If you had to use get string\nevery time you compile your code 6354 06:07:23,117 --> 06:07:24,450 it'd be kind of annoying, right? 6355 06:07:24,451 --> 06:07:28,201 You type make, then you might get a\n 6356 06:07:28,201 --> 06:07:31,951 Then you type in hello, or cash, or\n 6357 06:07:31,951 --> 06:07:33,591 it just really slows the process. 6358 06:07:33,591 --> 06:07:35,701 But in this\ncommand-line-interface world 6359 06:07:35,701 --> 06:07:39,031 if you support command line arguments,\n 6360 06:07:39,030 --> 06:07:42,431 Like, scrolling up and down in\n 6361 06:07:42,431 --> 06:07:46,690 You can just type commands more quickly\n 6362 06:07:46,690 --> 06:07:49,260 And you don't have to keep\nprompting the user, more 6363 06:07:49,260 --> 06:07:52,020 pedantically, for more and more info. 6364 06:07:52,021 --> 06:07:54,541 So any questions then on\ncommand line arguments? 6365 06:07:54,541 --> 06:07:58,261 Which, finally, reveals why\nwe had (void) initially 6366 06:07:58,260 --> 06:08:00,870 but what more we can now put in main. 6367 06:08:00,870 --> 06:08:03,330 That's how you take\ncommand line arguments. 6368 06:08:04,760 --> 06:08:06,870 AUDIENCE: If you were to put-- 6369 06:08:06,870 --> 06:08:11,580 if you were to use argv, and you\n 6370 06:08:11,580 --> 06:08:14,183 would it still give you, like, a string? 6371 06:08:14,184 --> 06:08:15,766 Would that still be considered string? 6372 06:08:15,766 --> 06:08:17,184 Or would you consider [INAUDIBLE]? 
6373 06:08:18,021 --> 06:08:20,811 If you were to type at\nthe command line something 6374 06:08:20,811 --> 06:08:24,921 like, not a word, but\nsomething like the number 42 6375 06:08:24,920 --> 06:08:27,710 that would actually be\ntreated as a string. 6376 06:08:28,550 --> 06:08:30,480 Because again, context matters. 6377 06:08:30,480 --> 06:08:33,201 So if your program is\ncurrently manipulating memory 6378 06:08:33,201 --> 06:08:36,771 as though it's characters or strings,\n 6379 06:08:36,771 --> 06:08:41,061 are, they will be interpreted\nas ASCII text, or Unicode text. 6380 06:08:41,061 --> 06:08:44,901 If we therefore go to the chart here,\n 6381 06:08:44,901 --> 06:08:48,771 then how do you distinguish numbers\n 6382 06:08:50,151 --> 06:08:58,641 Well, notice 65 is A, 97 is a,\nbut also 49 is 1, and 50 is 2. 6383 06:08:58,640 --> 06:09:01,760 So the designers of ASCII,\nand then later Unicode 6384 06:09:01,760 --> 06:09:04,940 realized well wait a minute,\nif we want to support programs 6385 06:09:04,940 --> 06:09:07,700 that let you type things\nthat look like numbers 6386 06:09:07,701 --> 06:09:10,611 even though they're not\ntechnically ints or floats 6387 06:09:10,611 --> 06:09:14,881 we need a way in ASCII and\nUnicode to represent even numbers. 6388 06:09:16,131 --> 06:09:19,471 And it's a little silly that we have\n 6389 06:09:19,471 --> 06:09:22,123 But again, if you're in the\nworld of letters and characters 6390 06:09:22,123 --> 06:09:24,291 you've got to come up with\na mapping for everything. 6391 06:09:24,291 --> 06:09:26,050 And notice here, here's the dot. 6392 06:09:26,050 --> 06:09:30,650 Even if you were to represent 1.23\n 6393 06:09:30,651 --> 06:09:35,101 even the dot now is going to be\n 6394 06:09:35,100 --> 06:09:37,190 So again, context here matters. 6395 06:09:37,190 --> 06:09:41,630 All right, one final example\nto tease apart what this int is 6396 06:09:41,631 --> 06:09:44,101 and what it's been\ndoing here for so long.
6397 06:09:44,100 --> 06:09:49,040 So I'm going to add one\nbit of logic to a new file 6398 06:09:49,041 --> 06:09:52,011 that I'm going to call exit.c. 6399 06:09:53,390 --> 06:09:57,140 We're going to introduce something that\n 6400 06:09:57,140 --> 06:09:59,240 It turns out this is not\na feature we've used yet 6401 06:09:59,241 --> 06:10:01,501 but it's just useful to know about. 6402 06:10:01,501 --> 06:10:04,611 Especially when automating\ntests of your own code. 6403 06:10:04,611 --> 06:10:08,376 When it comes to figuring out if\na program succeeded or failed. 6404 06:10:08,376 --> 06:10:13,131 It turns out that main has one\n 6405 06:10:13,131 --> 06:10:18,591 An ability to signal to the user\n 6406 06:10:18,591 --> 06:10:22,021 And that's by way of\nmain's return value. 6407 06:10:22,021 --> 06:10:26,320 So I'm going to modify this\nprogram as follows, like this. 6408 06:10:26,320 --> 06:10:29,181 Suppose I want to write\na similar program that 6409 06:10:29,181 --> 06:10:32,161 requires that the user\ntype a word at the prompt. 6410 06:10:32,161 --> 06:10:36,710 So that argc has to be 2\nfor whatever design purpose. 6411 06:10:36,710 --> 06:10:43,251 If argc does not equal 2, I want to\n 6412 06:10:43,251 --> 06:10:46,850 I want to insist that the user\noperate the program correctly. 6413 06:10:46,850 --> 06:10:53,060 So I might give them an error message\n 6414 06:10:53,061 --> 06:10:55,441 But now I want to quit\nout of the program. 6415 06:10:56,570 --> 06:11:01,521 The right way, quote-unquote, to do\n 6416 06:11:01,521 --> 06:11:04,851 Now it's a little weird\nbecause no one called main yet 6417 06:11:04,850 --> 06:11:07,251 right, main just gets\ncalled automatically 6418 06:11:07,251 --> 06:11:09,561 but the convention is\nanytime something goes 6419 06:11:09,561 --> 06:11:14,361 wrong in a program you should\nreturn a non-zero value from main.
6420 06:11:16,041 --> 06:11:19,730 We don't need to get into the weeds of\n 6421 06:11:20,480 --> 06:11:26,030 But if you return 1, that is a clue to\n 6422 06:11:26,030 --> 06:11:27,690 device that something went wrong. 6423 06:11:29,931 --> 06:11:35,721 If everything works fine, like, let's go\n 6424 06:11:35,721 --> 06:11:40,881 before, quote-unquote argv bracket 1. 6425 06:11:40,881 --> 06:11:43,341 So this is just a version of\nthe program without an else. 6426 06:11:43,341 --> 06:11:45,651 So this is the same\nas doing, essentially 6427 06:11:45,651 --> 06:11:47,841 an else here like I did earlier. 6428 06:11:47,841 --> 06:11:51,001 I want to signal to the\ncomputer that all is well. 6429 06:11:52,550 --> 06:11:55,911 But strictly speaking, if\nI'm already returning here 6430 06:11:55,911 --> 06:11:58,820 I don't technically need, if\nI really want to be nit picky 6431 06:11:58,820 --> 06:12:01,131 I don't technically need the\nelse because the only way 6432 06:12:01,131 --> 06:12:05,747 I'm going to get to line 11\nis if I didn't already return. 6433 06:12:07,440 --> 06:12:10,790 The only new thing here logically,\n 6434 06:12:10,791 --> 06:12:13,070 I'm returning a value from main. 6435 06:12:13,070 --> 06:12:14,990 That's something I\ncould always have done 6436 06:12:14,991 --> 06:12:19,551 because main has always been defined by\n 6437 06:12:19,550 --> 06:12:24,140 By default, main automatically,\n 6438 06:12:24,140 --> 06:12:27,111 If you've never once used the\nreturn keyword, which you probably 6439 06:12:27,111 --> 06:12:29,631 haven't in main, it just\nautomatically returns 0 6440 06:12:29,631 --> 06:12:31,556 and the system assumes\nthat all went well.
6441 06:12:31,556 --> 06:12:33,651 But now that we're starting\nto get a little more 6442 06:12:33,651 --> 06:12:35,781 sophisticated with our\ncode, and you know 6443 06:12:35,780 --> 06:12:39,740 the programmer, something went\n 6444 06:12:39,741 --> 06:12:44,871 You can exit out of them by returning\n 6445 06:12:44,870 --> 06:12:47,300 And this is fortuitous\nthat it's an int, right? 6446 06:12:49,370 --> 06:12:53,510 Unfortunately, in programming, there are\n 6447 06:12:54,501 --> 06:12:57,471 And int gives you 4\nbillion possible codes 6448 06:12:57,471 --> 06:13:00,716 that you can use, a.k.a. exit\nstatuses, to signify errors. 6449 06:13:00,716 --> 06:13:04,190 So if you've ever on your Mac\nor PC gotten some weird pop up 6450 06:13:04,190 --> 06:13:07,580 that an error happened, sometimes,\n 6451 06:13:07,580 --> 06:13:09,680 Maybe it's positive,\nmaybe it's negative. 6452 06:13:09,681 --> 06:13:14,431 It might say error code 123, or\n 6453 06:13:14,431 --> 06:13:18,570 What you're generally seeing, are\n 6454 06:13:18,570 --> 06:13:21,870 values from main in a program\nthat someone at Microsoft 6455 06:13:21,870 --> 06:13:25,380 or Apple, or somewhere else\nwrote, something went wrong 6456 06:13:25,381 --> 06:13:30,241 they are unnecessarily showing you,\n 6457 06:13:30,241 --> 06:13:33,361 If only, so that when you call\n 6458 06:13:33,361 --> 06:13:36,451 you can tell them what exit\nstatus you encountered 6459 06:13:36,451 --> 06:13:39,331 what error code you encounter. 6460 06:13:39,330 --> 06:13:43,650 All right, any questions\non exit statuses 6461 06:13:43,651 --> 06:13:48,841 which is the last of our new\nbuilding blocks, for now? 
6462 06:13:50,300 --> 06:13:57,800 AUDIENCE: [INAUDIBLE] You know how\n 6463 06:13:57,800 --> 06:13:59,679 if you want to make [INAUDIBLE] 6464 06:14:00,346 --> 06:14:03,526 The question is can you\ndo things again and again 6465 06:14:03,526 --> 06:14:06,151 at the command line like you\ncould with get string and get int. 6466 06:14:06,151 --> 06:14:08,131 Which, by default,\nrecall are automatically 6467 06:14:08,131 --> 06:14:10,681 designed to keep prompting\nthe user in their own loop 6468 06:14:10,681 --> 06:14:14,221 until they give you an int, or a\n 6469 06:14:15,001 --> 06:14:16,471 You're going to get an\nerror message but then 6470 06:14:16,471 --> 06:14:18,263 you're going to be\nreturned to your prompt. 6471 06:14:18,262 --> 06:14:21,647 And it's up to you to type\nit correctly the next time. 6472 06:14:22,730 --> 06:14:27,695 AUDIENCE: [INAUDIBLE]\nautomatically for you. 6473 06:14:27,695 --> 06:14:29,570 DAVID MALAN: If you\ndo not return a value 6474 06:14:29,570 --> 06:14:32,990 explicitly main will\nautomatically return 0 for you 6475 06:14:32,991 --> 06:14:36,901 that is the way C simply works\nso it's not strictly necessary. 6476 06:14:36,901 --> 06:14:39,771 But now that we're starting\nto return values explicitly 6477 06:14:39,771 --> 06:14:42,351 if something goes wrong,\nit would be good practice 6478 06:14:42,350 --> 06:14:45,740 to also start returning a value\n 6479 06:14:48,036 --> 06:14:52,070 So let's now get out of\nthe weeds and contextualize 6480 06:14:52,070 --> 06:14:55,460 this for some actual problems that\n 6481 06:14:55,460 --> 06:14:57,390 by way of problem set 2 and beyond. 6482 06:15:00,001 --> 06:15:04,251 So here for instance, is a\nproblem that you might think back 6483 06:15:04,251 --> 06:15:08,241 to when you were a kid the\n 6484 06:15:08,241 --> 06:15:10,491 the grade level in which\nsome book is written.
6485 06:15:10,491 --> 06:15:14,001 If you're a young student, you\nmight read at first-grade level 6486 06:15:14,001 --> 06:15:15,501 or third-grade level in the US. 6487 06:15:15,501 --> 06:15:17,293 Or, if you're in college\npresumably, you're 6488 06:15:17,293 --> 06:15:19,206 reading at a university-level of text. 6489 06:15:19,205 --> 06:15:22,333 But what does it mean\nfor text, like in a book 6490 06:15:22,333 --> 06:15:24,501 or in an essay, or something\nlike that to correspond 6491 06:15:24,501 --> 06:15:25,850 to some kind of grade level? 6492 06:15:25,850 --> 06:15:29,210 Well, here's a quote-- a\ntitle of a childhood book. 6493 06:15:29,210 --> 06:15:31,850 One Fish, Two Fish, Red Fish, Blue Fish. 6494 06:15:31,850 --> 06:15:35,100 What might the grade level be for\n 6495 06:15:35,100 --> 06:15:37,850 Maybe, when you were a kid or if\n 6496 06:15:37,850 --> 06:15:40,520 these things, what might the\ngrade level of this thing be? 6497 06:15:47,643 --> 06:15:49,911 DAVID MALAN: Before grade\n1 is, in fact, correct. 6498 06:15:49,911 --> 06:15:51,550 So that's for really young kids? 6499 06:15:53,440 --> 06:15:56,470 These are pretty simple phrases, right? 6500 06:15:57,760 --> 06:16:00,220 I mean there's not even\nverbs in these sentences 6501 06:16:00,221 --> 06:16:04,301 they're just nouns and adjectives,\nand very short sentences. 6502 06:16:04,300 --> 06:16:06,460 And so that might be a\nheuristic we could use. 6503 06:16:06,460 --> 06:16:09,070 When analyzing text, well if\nthe words are kind of short 6504 06:16:09,070 --> 06:16:11,501 the sentences are kind of\nshort, everything's very simple 6505 06:16:11,501 --> 06:16:14,510 that's probably a very\nyoung, or early, grade level. 6506 06:16:14,510 --> 06:16:17,925 And so by one formulation, it might\n 6507 06:16:19,931 --> 06:16:22,282 Mr and Mrs. 
Dursley, of\nnumber 4, Privet Drive 6508 06:16:22,282 --> 06:16:25,240 were proud to say that they were\n 6509 06:16:25,241 --> 06:16:27,221 They were the last\npeople you would expect 6510 06:16:27,221 --> 06:16:29,381 to be involved in anything\nstrange or mysterious 6511 06:16:29,381 --> 06:16:32,111 because they just didn't\nhold with such nonsense. 6512 06:16:33,043 --> 06:16:34,751 All right, what grade\nlevel is this book? 6513 06:16:36,039 --> 06:16:37,331 DAVID MALAN: OK, I heard third. 6514 06:16:38,846 --> 06:16:40,241 DAVID MALAN: Seventh, fifth. 6515 06:16:41,411 --> 06:16:44,800 But grade 7, according to\none particular measure. 6516 06:16:44,800 --> 06:16:49,062 And whether or not we can debate exactly\n 6517 06:16:49,062 --> 06:16:51,520 and maybe you're feeling ahead\nof your time, or behind now. 6518 06:16:51,521 --> 06:16:55,730 But here, we have a snippet of text. 6519 06:16:55,730 --> 06:17:00,820 What makes this text assume an older\n 6520 06:17:00,820 --> 06:17:03,951 a higher grade level, would you think? 6521 06:17:06,675 --> 06:17:09,370 DAVID MALAN: Yeah, it's longer,\ndifferent types of words 6522 06:17:09,370 --> 06:17:11,773 there's commas now in\nphrases, and so forth. 6523 06:17:11,774 --> 06:17:13,941 So there's just some kind\nof sophistication to this. 6524 06:17:13,940 --> 06:17:16,540 So it turns out for the\nupcoming problem set 6525 06:17:16,541 --> 06:17:19,631 among the things you'll do is\ntake, as input, texts like this 6526 06:17:20,771 --> 06:17:23,333 Considering, well, how\nmany words are in the text? 6527 06:17:23,332 --> 06:17:24,790 How many sentences are in the text? 6528 06:17:24,791 --> 06:17:26,636 How many letters are in the text? 6529 06:17:26,635 --> 06:17:30,430 And use those according to a\n 6530 06:17:30,431 --> 06:17:33,940 exactly, the grade level of some\n 6531 06:17:34,843 --> 06:17:37,050 Well what else are we going\nto do in the coming days?
6532 06:17:37,050 --> 06:17:39,670 Well I've alluded to this notion\nof cryptography in the past. 6533 06:17:39,670 --> 06:17:42,610 This notion of scrambling\ninformation in such a way 6534 06:17:42,611 --> 06:17:45,683 that you can hide the\ncontents of a message 6535 06:17:45,683 --> 06:17:47,890 from someone who might\notherwise intercept it, right? 6536 06:17:47,890 --> 06:17:50,390 The earliest form of this might\nalso be when you're younger 6537 06:17:50,390 --> 06:17:53,650 and you're in class, and you're passing\n 6538 06:17:53,651 --> 06:17:54,911 from yourself to someone else. 6539 06:17:54,911 --> 06:17:57,221 You don't want to necessarily\nwrite a note in English 6540 06:17:57,221 --> 06:17:59,381 or some other written,\nlanguage you might want 6541 06:17:59,381 --> 06:18:01,691 to scramble it somehow, or encrypt it. 6542 06:18:01,690 --> 06:18:04,720 Maybe you change the As\nto a B, and the Bs to a C. 6543 06:18:04,721 --> 06:18:07,031 So that if the teacher snaps\nit up and intercepts it 6544 06:18:07,030 --> 06:18:09,460 they can't actually\nunderstand what it is you've 6545 06:18:09,460 --> 06:18:11,420 written because it's encrypted. 6546 06:18:11,420 --> 06:18:13,870 So long as your friend,\nthe recipient of this note 6547 06:18:13,870 --> 06:18:16,150 knows how you manipulated it. 6548 06:18:16,151 --> 06:18:19,901 How you added or subtracted\nletters to each other 6549 06:18:19,901 --> 06:18:23,111 they can decrypt it, which\nis to reverse that process. 6550 06:18:23,111 --> 06:18:26,331 So formally, in the world of\ncryptography and computer science 6551 06:18:26,330 --> 06:18:28,390 this is another problem to solve. 6552 06:18:28,390 --> 06:18:31,434 Your input, though, when you have a\n 6553 06:18:31,434 --> 06:18:33,101 is what's generally known as plain text. 
6554 06:18:33,100 --> 06:18:37,240 There's some algorithm that's\ngoing to then encipher, or encrypt 6555 06:18:37,241 --> 06:18:40,361 that information, into what's\ncalled ciphertext, which 6556 06:18:40,361 --> 06:18:42,911 is the scrambled version that\ntheoretically can get safely 6557 06:18:42,911 --> 06:18:45,370 intercepted and your message\nhas not been spoiled 6558 06:18:45,370 --> 06:18:48,880 unless that intercept\nactually knows what algorithm 6559 06:18:48,881 --> 06:18:51,411 you used inside of this process. 6560 06:18:51,411 --> 06:18:53,980 So that would be generally\nknown as a cipher. 6561 06:18:53,980 --> 06:18:57,341 The ciphers typically take,\nthough, not one input, but two. 6562 06:18:57,341 --> 06:19:01,945 If, for instance, your cipher\nis as simple as A becomes B 6563 06:19:01,945 --> 06:19:05,681 B becomes C, C becomes D,\ndot dot dot, Z becomes A 6564 06:19:05,681 --> 06:19:09,401 you're essentially adding one to\nevery letter and encrypting it. 6565 06:19:09,401 --> 06:19:12,011 Now that would be,\nwhat we call, the key. 6566 06:19:12,010 --> 06:19:15,730 You and the recipient both have to\n 6567 06:19:15,730 --> 06:19:19,541 in advance, what number you're\ngoing to use that day to rotate 6568 06:19:19,541 --> 06:19:21,221 or change all of these letters by. 6569 06:19:21,221 --> 06:19:24,671 Because when you add 1, they\nupon receiving your ciphertext 6570 06:19:24,670 --> 06:19:27,350 have to subtract 1 to\nget back the answer. 6571 06:19:27,350 --> 06:19:31,990 For instance, if the input,\nplaintext, is hi, as before 6572 06:19:31,991 --> 06:19:37,271 and the key is 1, the ciphertext using\n 6573 06:19:37,271 --> 06:19:41,980 otherwise known as the Caesar cipher,\n 6574 06:19:41,980 --> 06:19:45,668 So it's similar, but it's at\nleast scrambled at first glance. 
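The rotate-by-a-key idea just described can be sketched in C as below. This is one possible formulation, assuming ASCII text; the function names `caesar_encrypt` and `caesar_decrypt` are illustrative and not taken from the lecture.

```c
#include <ctype.h>

// Rotates every letter in `text` forward by `key` positions, in place,
// wrapping from Z back to A. Non-letters are left untouched.
// Assumes an ASCII-compatible character set.
void caesar_encrypt(char *text, int key)
{
    // Normalize the key into 0..25 so that negative keys also work.
    int shift = ((key % 26) + 26) % 26;

    for (int i = 0; text[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) text[i];
        if (isupper(c))
        {
            text[i] = (char) ('A' + (c - 'A' + shift) % 26);
        }
        else if (islower(c))
        {
            text[i] = (char) ('a' + (c - 'a' + shift) % 26);
        }
    }
}

// Decrypting is the same rotation with the opposite key.
void caesar_decrypt(char *text, int key)
{
    caesar_encrypt(text, -key);
}
```

With a key of 1, the buffer "HI!" becomes "IJ!", and calling `caesar_decrypt` with the same key restores "HI!". Both sides need the same key, which is the agreement in advance that the lecture mentions.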
6575 06:19:45,669 --> 06:19:47,711 And unless the teacher\nreally cares to figure out 6576 06:19:47,710 --> 06:19:50,681 what algorithm are they using today,\n 6577 06:19:50,681 --> 06:19:53,960 it's probably sufficiently\nsecure for your purposes. 6578 06:19:53,960 --> 06:19:55,420 How do you reverse the process? 6579 06:19:55,420 --> 06:19:58,450 Well, your friend gets this\nand reverses it by negative 1. 6580 06:19:58,451 --> 06:20:02,890 So I becomes H, J becomes I,\nand things like punctuation 6581 06:20:02,890 --> 06:20:05,320 remain untouched at\nleast in this scheme. 6582 06:20:05,320 --> 06:20:07,841 So let's consider one\nfinal example here. 6583 06:20:07,841 --> 06:20:15,341 If the input to the algorithm\nis Uijtxbtdt50, and the key 6584 06:20:17,350 --> 06:20:23,770 Such that now B should become A, and C\n 6585 06:20:23,771 --> 06:20:25,390 So we're going in the other direction. 6586 06:20:27,291 --> 06:20:30,261 Well if we spread all the letters\n 6587 06:20:30,260 --> 06:20:36,040 and we start subtracting one letter,\n 6588 06:20:36,041 --> 06:20:41,480 T becomes S, X becomes W, A, was, D, T-- 6589 06:22:01,705 --> 06:22:06,530 DAVID J. MALAN: This is CS50, and\nthis is already week three. 6590 06:22:06,530 --> 06:22:10,056 And even as we've gotten much more\n 6591 06:22:10,056 --> 06:22:11,931 and some of the C stuff\nthat we've been doing 6592 06:22:11,931 --> 06:22:15,201 is all the more cryptic looking,\n 6593 06:22:15,201 --> 06:22:19,008 like, everything we've been doing\n 6594 06:22:19,008 --> 06:22:20,841 So keep that in mind,\nparticularly as things 6595 06:22:20,841 --> 06:22:23,591 seem like they're getting more\n 6596 06:22:23,591 --> 06:22:26,210 It's just a process of learning\na new language that ultimately 6597 06:22:26,210 --> 06:22:28,791 lets us express this process. 6598 06:22:28,791 --> 06:22:31,851 And of course, last week we really\n 6599 06:22:31,850 --> 06:22:33,600 inputs and outputs are represented. 
6600 06:22:33,600 --> 06:22:37,940 And this thing here, a photograph\nthereof, is called what? 6601 06:22:39,989 --> 06:22:41,031 DAVID J. MALAN: RAM, I heard-- 6602 06:22:41,030 --> 06:22:44,030 Random Access Memory or just\ngenerally known as memory. 6603 06:22:44,030 --> 06:22:46,640 And recall that we looked at\none of these little black chips 6604 06:22:46,640 --> 06:22:48,861 that contains all of the bytes-- 6605 06:22:48,861 --> 06:22:50,151 all of the bits, ultimately. 6606 06:22:50,151 --> 06:22:52,761 It's just kind of a grid,\nsort of an artist grid, that 6607 06:22:52,760 --> 06:22:55,880 allows us to think about every\none of these memory locations 6608 06:22:55,881 --> 06:22:58,401 as just having a number or\nan address, so to speak. 6609 06:22:58,401 --> 06:23:01,131 Like, this might be byte\nnumber 0 and then 1 and then 2 6610 06:23:01,131 --> 06:23:04,041 and then, maybe way down here\nagain, something like 2 billion 6611 06:23:04,041 --> 06:23:06,441 if you have 2 gigabytes of memory. 6612 06:23:06,440 --> 06:23:10,431 And so as we did that, we started to\n 6613 06:23:10,431 --> 06:23:14,120 to create kind of our own information,\n 6614 06:23:14,120 --> 06:23:16,740 just the basics like ints\nand floats and so forth. 6615 06:23:16,741 --> 06:23:18,831 But we also talked about strings. 6616 06:23:18,830 --> 06:23:21,440 And what is a string as you now know it? 6617 06:23:21,440 --> 06:23:24,560 How would you describe in\nlayperson's terms a string? 6618 06:23:26,841 --> 06:23:28,258 DAVID J. MALAN: An array of characters. 6619 06:23:28,258 --> 06:23:30,051 And an array, meanwhile--\nlet's go there. 6620 06:23:30,050 --> 06:23:34,940 How might someone else define an\n 6621 06:23:38,030 --> 06:23:41,431 AUDIENCE: Kind of like\nan indexed set of things. 6622 06:23:41,431 --> 06:23:43,421 DAVID J. MALAN: An indexed\nset of things-- not bad. 
6623 06:23:43,420 --> 06:23:46,570 And I think a key characteristic to\n 6624 06:23:46,570 --> 06:23:48,251 does actually pertain to memory. 6625 06:23:50,170 --> 06:23:53,560 Byte after byte after byte\nis what constitutes an array. 6626 06:23:53,561 --> 06:23:56,171 And we'll see in a couple of\nweeks time that there's actually 6627 06:23:56,170 --> 06:24:00,130 more interesting ways to use this same\n 6628 06:24:00,131 --> 06:24:03,601 things that are sort of two directional\n 6629 06:24:04,100 --> 06:24:07,330 But for now, all we've talked about\n 6630 06:24:07,330 --> 06:24:11,600 from left to right, top to bottom,\n 6631 06:24:11,600 --> 06:24:14,260 So today, we'll consider still an array. 6632 06:24:14,260 --> 06:24:17,380 But we won't focus so\nmuch on representation 6633 06:24:17,381 --> 06:24:18,820 of strings or other data types. 6634 06:24:18,820 --> 06:24:21,278 We'll actually now focus on\nthe other part of that process 6635 06:24:21,278 --> 06:24:24,310 of inputs becoming outputs,\nnamely the thing in the middle-- 6636 06:24:25,300 --> 06:24:29,560 But we have to keep in mind, even though\n 6637 06:24:29,561 --> 06:24:32,971 thus far, certainly on the board\n 6638 06:24:32,971 --> 06:24:34,721 have the luxury of\njust kind of eyeballing 6639 06:24:34,721 --> 06:24:38,181 the whole thing with a bird's eye view\n 6640 06:24:38,681 --> 06:24:40,513 If I asked you where a\nparticular number is 6641 06:24:40,513 --> 06:24:43,510 like zero, odds are your eyes\nwould go right to where it is 6642 06:24:43,510 --> 06:24:46,510 and boom, problem solved\nin sort of one step. 
6643 06:24:46,510 --> 06:24:51,620 But the catch is, with a computer\n 6644 06:24:51,620 --> 06:24:55,091 the human, can [INAUDIBLE] see\n 6645 06:24:55,091 --> 06:24:58,451 It's better to think of your\n 6646 06:24:58,451 --> 06:25:01,361 or more specifically an\narray of memory like this 6647 06:25:01,361 --> 06:25:05,861 as really being a set of closed\n 6648 06:25:05,861 --> 06:25:08,620 And only by opening\neach of those doors can 6649 06:25:08,620 --> 06:25:10,451 the computer actually\nsee what's in there 6650 06:25:10,451 --> 06:25:12,971 which is to say that the\ncomputer, unlike you, doesn't 6651 06:25:12,971 --> 06:25:16,841 have this bird's eye view of all\n 6652 06:25:16,841 --> 06:25:19,760 It has to much more\nmethodically look here 6653 06:25:19,760 --> 06:25:23,950 maybe look here, maybe look here, and\n 6654 06:25:23,951 --> 06:25:28,031 Now fortunately, we already have some\n 6655 06:25:28,030 --> 06:25:29,570 Boolean expressions, and the like-- 6656 06:25:29,570 --> 06:25:31,480 where you could imagine\nwriting some code 6657 06:25:31,480 --> 06:25:36,041 that very methodically goes from left\n 6658 06:25:36,041 --> 06:25:39,851 more sophisticated that actually\n 6659 06:25:39,850 --> 06:25:43,210 And just remember that the\nconventions we've had since last week 6660 06:25:43,210 --> 06:25:47,681 now is that these arrays are\nzero indexed, so to speak. 6661 06:25:47,681 --> 06:25:52,511 To be zero indexed just means that the\n 6662 06:25:52,510 --> 06:25:56,290 So this is location 0, 1, 2, 3, 4, 5, 6. 6663 06:25:56,291 --> 06:25:59,681 And notice even though there\nare seven total doors here 6664 06:25:59,681 --> 06:26:02,291 the right-most one,\nof course, is called 6 6665 06:26:02,291 --> 06:26:04,300 just because we've\nstarted counting at 0. 
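The zero-indexing convention above (seven doors, right-most one called 6) can be written down as a small C sketch. The helper names `last_index` and `last_element` are hypothetical, and both assume the array has at least one element.

```c
#include <stddef.h>

// Index of the right-most element in a zero-indexed array of n elements.
// Assumes n >= 1.
size_t last_index(size_t n)
{
    return n - 1;
}

// Value behind the right-most door. Assumes n >= 1.
int last_element(const int doors[], size_t n)
{
    return doors[last_index(n)];
}
```

For seven doors, `last_index(7)` is 6, which matches the count in the lecture: the doors are numbered 0 through 6.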
6666 06:26:04,300 --> 06:26:09,041 So in the general case, if you\nhad n doors or n bytes of memory 6667 06:26:09,041 --> 06:26:13,841 0 would always be at the left, and n\n 6668 06:26:13,841 --> 06:26:18,221 That's sort of a generalization of just\n 6669 06:26:18,221 --> 06:26:22,151 All right, so let's revisit the problem\n 6670 06:26:22,151 --> 06:26:24,884 with in week zero, which was\nthis notion of searching. 6671 06:26:24,883 --> 06:26:26,800 And what does it mean\nto search for something? 6672 06:26:26,800 --> 06:26:29,578 Well, to find information-- and\nthis, of course, is omnipresent. 6673 06:26:29,579 --> 06:26:32,621 Anytime you take out your phone, you're\n 6674 06:26:32,620 --> 06:26:35,501 Any time you pull up a browser,\n 6675 06:26:35,501 --> 06:26:39,670 So search is kind of one of the\n 6676 06:26:41,238 --> 06:26:44,320 So let's consider how the Googles, the\n 6677 06:26:44,320 --> 06:26:48,021 are implementing something as\nseemingly familiar as this. 6678 06:26:48,021 --> 06:26:50,381 So here might be the problem statement. 6679 06:26:50,381 --> 06:26:52,811 We want some input to\nbecome some output. 6680 06:26:52,811 --> 06:26:54,161 What's that input going to be? 6681 06:26:54,161 --> 06:26:57,881 Maybe it's a bunch of closed doors\n 6682 06:26:57,881 --> 06:27:00,611 to get back an answer, true or false. 6683 06:27:00,611 --> 06:27:03,146 Is something we're\nlooking for there or not? 6684 06:27:03,146 --> 06:27:05,771 You can imagine taking this one\nstep further and trying to find 6685 06:27:05,771 --> 06:27:07,881 where is the thing you're looking for. 6686 06:27:07,881 --> 06:27:10,300 But for now, let's just take\none bite out of the problem. 6687 06:27:10,300 --> 06:27:14,920 Can we tell ourselves, true or\nfalse, is some number behind one 6688 06:27:14,920 --> 06:27:17,751 of these doors or lockers in memory? 
6689 06:27:17,751 --> 06:27:22,120 But before we go there and start\n 6690 06:27:22,940 --> 06:27:27,040 Let's consider how we might\nlay the foundation of, like 6691 06:27:27,041 --> 06:27:30,469 comparing whether one algorithm\nis better than another. 6692 06:27:30,469 --> 06:27:32,261 We talked about\ncorrectness, and it sort of 6693 06:27:32,260 --> 06:27:35,800 goes without saying that any code you\n 6694 06:27:36,820 --> 06:27:39,701 Otherwise, what's the point if it\n 6695 06:27:39,701 --> 06:27:41,620 But we also talked about design. 6696 06:27:41,620 --> 06:27:45,640 And in your own words, what do we\n 6697 06:27:45,640 --> 06:27:48,830 designed at this stage than another? 6698 06:27:48,830 --> 06:27:52,240 How do you think about\nthis notion of design now? 6699 06:27:53,411 --> 06:27:55,661 AUDIENCE: Easier to understand\nor easier to institute. 6700 06:27:55,661 --> 06:27:57,350 DAVID J. MALAN: OK, so easier to understand. 6701 06:28:00,341 --> 06:28:03,173 DAVID J. MALAN: Efficiency, and what do\n 6702 06:28:07,190 --> 06:28:09,751 It doesn't use up too much\nmemory, and it isn't redundant. 6703 06:28:09,751 --> 06:28:11,480 So you can think about\ndesign along a few 6704 06:28:11,480 --> 06:28:13,438 of these axes-- sort of\nthe quality of the code 6705 06:28:13,438 --> 06:28:15,630 but also the quality of the performance. 6706 06:28:15,631 --> 06:28:20,179 And as our programs get bigger and\n 6707 06:28:20,179 --> 06:28:22,221 those kinds of things are\nreally going to matter. 6708 06:28:22,221 --> 06:28:24,221 And in the real world,\nif you start writing code 6709 06:28:24,221 --> 06:28:26,991 not just by yourself but with\nsomeone else, getting the design 6710 06:28:26,991 --> 06:28:30,501 right is just going to make it easier\n 6711 06:28:30,501 --> 06:28:33,030 write code, with just\nhigher probability. 6712 06:28:33,030 --> 06:28:36,650 So let's consider how we might focus\n 6713 06:28:36,651 --> 06:28:38,751 the efficiency, of an algorithm. 
6714 06:28:38,751 --> 06:28:42,591 And the way we might talk about the\n 6715 06:28:42,591 --> 06:28:45,710 or how slow they are, is in\nterms of their running time. 6716 06:28:45,710 --> 06:28:49,260 That is to say, when they're\n 6717 06:28:49,260 --> 06:28:52,370 And we might measure this in\nseconds or milliseconds or minutes 6718 06:28:52,370 --> 06:28:54,710 or just some number of\nsteps in the general case 6719 06:28:54,710 --> 06:28:58,920 because presumably fewer steps, to\n 6720 06:28:58,920 --> 06:29:00,770 So how might we think\nabout running times? 6721 06:29:00,771 --> 06:29:03,831 Well, there's one general\nnotation we should define today. 6722 06:29:03,830 --> 06:29:07,970 So computer scientists tend to describe\n 6723 06:29:07,971 --> 06:29:12,201 or a piece of code, for that matter, in\n 6724 06:29:12,201 --> 06:29:14,991 This is literally a\ncapitalized O, a big O. 6725 06:29:14,991 --> 06:29:18,471 And this generally means that the\nrunning time of some algorithm 6726 06:29:18,471 --> 06:29:21,920 is on the order of such and such,\n 6727 06:29:21,919 --> 06:29:24,529 is just going to be a very\nsimple mathematical formula. 6728 06:29:24,529 --> 06:29:26,810 It's kind of a way of waving\nyour hands mathematically 6729 06:29:26,811 --> 06:29:31,131 to convey the idea of just how fast\n 6730 06:29:31,131 --> 06:29:32,961 without getting into\nthe weeds of like, it 6731 06:29:32,960 --> 06:29:36,690 took this many milliseconds or\n 6732 06:29:36,690 --> 06:29:40,640 So you might recall then from week\n 6733 06:29:41,721 --> 06:29:45,021 At the time, we just use this to\n 6734 06:29:45,021 --> 06:29:47,960 Recall that this red straight\nline was the first algorithm 6735 06:29:49,280 --> 06:29:54,230 The yellow line that's still\n 6736 06:29:54,230 --> 06:29:58,730 That line represented what\nalternative algorithm? 6737 06:29:59,730 --> 06:30:00,980 What is that second algorithm? 6738 06:30:01,791 --> 06:30:03,291 AUDIENCE: Like, two pages at a time. 
6739 06:30:03,291 --> 06:30:06,441 DAVID J. MALAN: Two pages at a time, which\n 6740 06:30:06,440 --> 06:30:09,700 potentially double back a page if maybe\n 6741 06:30:10,201 --> 06:30:12,733 So it had a potential bug\nbut arguably solvable. 6742 06:30:12,733 --> 06:30:15,440 This last algorithm, though, was\n 6743 06:30:15,440 --> 06:30:18,950 strategy where I sort of unnecessarily\n 6744 06:30:18,951 --> 06:30:20,960 and then in half and\nthen in half, which 6745 06:30:20,960 --> 06:30:25,251 as dramatic as that was unnecessarily,\n 6746 06:30:25,251 --> 06:30:27,780 bites out of the\nproblem-- like 500 pages 6747 06:30:27,780 --> 06:30:33,720 the first time, another 250, another\n 6748 06:30:33,721 --> 06:30:36,944 And so we described its\nrunning time as this picture 6749 06:30:36,943 --> 06:30:39,861 there, though I didn't use that\n 6750 06:30:39,861 --> 06:30:42,591 But indeed, time to solve\nmight be measured just 6751 06:30:42,591 --> 06:30:44,510 abstractly in some unit of measure-- 6752 06:30:44,510 --> 06:30:47,780 seconds, milliseconds, minutes, pages-- 6753 06:30:49,471 --> 06:30:52,191 So let's now slap some numbers on this. 6754 06:30:52,190 --> 06:30:55,610 If we had n pages in that\nphone book, n just representing 6755 06:30:55,611 --> 06:30:57,980 a generic number, the\nfirst algorithm here 6756 06:30:57,980 --> 06:30:59,811 we might describe as taking n steps. 6757 06:30:59,811 --> 06:31:03,230 Second algorithm we might describe\n 6758 06:31:03,230 --> 06:31:05,870 maybe give or take one if we\nhave to double back but generally 6759 06:31:06,843 --> 06:31:09,050 And then this thing, if you\nremember your logarithms 6760 06:31:09,050 --> 06:31:11,008 was sort of a fundamentally\ndifferent formula-- 6761 06:31:11,008 --> 06:31:14,460 log base 2 of n or just\nlog of n for short. 6762 06:31:14,460 --> 06:31:17,150 So this is of a fundamentally\ndifferent formula. 
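To put rough numbers on the three phone-book strategies, here is a small C sketch that counts idealized steps for a book of n pages. The function names are illustrative, and the counts are a simplification (for example, the possible double-back in the two-at-a-time strategy is ignored).

```c
// Idealized step counts for a phone book of n pages; not measurements.

// One page at a time: n steps in the worst case.
long steps_one_at_a_time(long n)
{
    return n;
}

// Two pages at a time: about n / 2 steps.
long steps_two_at_a_time(long n)
{
    return n / 2;
}

// Halving: how many times n can be cut in half before one page is left,
// which is roughly log base 2 of n.
long steps_halving(long n)
{
    long steps = 0;
    while (n > 1)
    {
        n /= 2;
        steps++;
    }
    return steps;
}
```

For a 1,000-page book these give 1,000, 500, and 9 steps. For a billion pages the first two give 1,000,000,000 and 500,000,000, while halving needs only 29.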
6763 06:31:17,151 --> 06:31:20,791 But what's noteworthy is that\nthese first two algorithms 6764 06:31:20,791 --> 06:31:24,411 even though, yes, the second\nalgorithm was hands down faster-- 6765 06:31:24,411 --> 06:31:26,271 I mean, literally twice as fast-- 6766 06:31:26,271 --> 06:31:30,921 when you start to zoom out and if\n 6767 06:31:30,920 --> 06:31:36,920 these first two start to look\nawfully similar to one another. 6768 06:31:36,920 --> 06:31:39,170 And if we keep zooming out\nand zooming out and zooming 6769 06:31:39,170 --> 06:31:41,450 out as n gets really large-- 6770 06:31:41,451 --> 06:31:43,581 that is, the x-axis gets really long-- 6771 06:31:43,580 --> 06:31:47,610 these first two algorithms start\nto become essentially the same. 6772 06:31:47,611 --> 06:31:50,751 And so this is where computer\nscientists use big O notation. 6773 06:31:50,751 --> 06:31:54,780 Instead of saying specifically,\nthis algorithm takes n steps 6774 06:31:54,780 --> 06:31:57,980 and this one n divided by 2, a\ncomputer scientist would say 6775 06:31:57,980 --> 06:32:01,370 eh, each of those algorithms\ntakes on the order of n steps 6776 06:32:01,370 --> 06:32:03,380 or on the order of n over 2. 6777 06:32:04,131 --> 06:32:07,851 On the order of n over 2\nis pretty much the same 6778 06:32:07,850 --> 06:32:13,980 when n gets really large as being\n 6779 06:32:13,980 --> 06:32:18,841 So yes, in practice, it's obviously\n 6780 06:32:18,841 --> 06:32:22,311 But in the big picture, when n\nbecomes a million, a billion 6781 06:32:22,311 --> 06:32:24,980 the numbers are already\nso darn big at that point 6782 06:32:24,980 --> 06:32:28,041 that these are, as the\nshapes of these curves imply 6783 06:32:28,041 --> 06:32:30,291 pretty much functionally equivalent. 6784 06:32:30,291 --> 06:32:33,441 But this one still\nlooks better and better 6785 06:32:33,440 --> 06:32:36,830 as n gets large because it's\nrising so much less quickly.
6786 06:32:36,830 --> 06:32:39,020 And so here, a computer\nscientist would say 6787 06:32:39,021 --> 06:32:43,679 that that third algorithm was on the\n 6788 06:32:43,679 --> 06:32:45,471 And they don't have to\nbother with the base 6789 06:32:45,471 --> 06:32:49,341 because it's a smaller mathematical\n 6790 06:32:49,341 --> 06:32:52,251 a constant, multiplicative factor. 6791 06:32:52,251 --> 06:32:54,201 So in short, what are\nthe takeaways here? 6792 06:32:54,201 --> 06:32:56,631 This is just a new\nvocabulary that we'll start 6793 06:32:56,631 --> 06:33:00,381 to use when we just want to describe\n 6794 06:33:00,381 --> 06:33:02,961 To make this more real, if\nany of you have implemented 6795 06:33:02,960 --> 06:33:08,650 a for loop at this point in any of your\n 6796 06:33:08,651 --> 06:33:11,951 where maybe n was the height\nof your pyramid or maybe n 6797 06:33:11,951 --> 06:33:15,911 was something else that you wanted\nto do n times, you wrote code 6798 06:33:15,911 --> 06:33:20,800 or you implemented an algorithm\n 6799 06:33:21,561 --> 06:33:23,711 So this is just a way now\nto retroactively start 6800 06:33:23,710 --> 06:33:27,850 describing with somewhat\nmathematical notation what we've 6801 06:33:27,850 --> 06:33:30,140 been doing in practice for a while now. 6802 06:33:30,140 --> 06:33:35,861 So here's a list of commonly seen\n 6803 06:33:35,861 --> 06:33:39,370 This is not a thorough list\nbecause you could come up 6804 06:33:39,370 --> 06:33:42,070 with an infinite number of\nmathematical formulas, certainly. 6805 06:33:42,070 --> 06:33:45,760 But the common ones we'll discuss\n 6806 06:33:45,760 --> 06:33:48,430 probably reduce to this list here. 6807 06:33:48,431 --> 06:33:50,681 And if you were to study\nmore computer science theory 6808 06:33:50,681 --> 06:33:52,263 this list would get longer and longer.
6809 06:33:52,263 --> 06:33:56,290 But for now, these are sort of the\n 6810 06:33:56,291 --> 06:33:59,061 All right, two other pieces\nof vocabulary, if you will 6811 06:33:59,061 --> 06:34:00,521 before we start to use this stuff-- 6812 06:34:00,521 --> 06:34:03,791 so this, a big omega,\ncapital omega symbol 6813 06:34:03,791 --> 06:34:09,531 is used now to describe a lower bound\n 6814 06:34:09,530 --> 06:34:12,970 So to be clear, big O is\non the order of-- that 6815 06:34:12,971 --> 06:34:16,271 is, an upper bound-- on how\nmany steps an algorithm might 6816 06:34:16,271 --> 06:34:19,061 take, on the order of so many steps. 6817 06:34:19,061 --> 06:34:21,941 If you want to talk, though,\nfrom the other perspective, well 6818 06:34:21,940 --> 06:34:24,070 how few steps might my algorithm take? 6819 06:34:24,070 --> 06:34:26,681 Maybe in the so-called\nbest case, it'd be nice 6820 06:34:26,681 --> 06:34:29,471 if we had a notation to\njust describe what a lower 6821 06:34:29,471 --> 06:34:32,201 bound is because some\nalgorithms might be super fast 6822 06:34:32,201 --> 06:34:34,251 in these so-called best cases. 6823 06:34:34,251 --> 06:34:38,021 So the symbology is almost the\nsame, but we replace the big O 6824 06:34:39,140 --> 06:34:42,940 So to be clear, big O describes\nan upper bound and omega 6825 06:34:44,471 --> 06:34:46,901 And we'll see examples\nof this before long. 6826 06:34:46,901 --> 06:34:52,781 And then lastly, last one here, big\n 6827 06:34:52,780 --> 06:34:57,220 when you have a case where both\n 6828 06:34:57,221 --> 06:35:00,251 running time is the\nsame as the lower bound. 6829 06:35:00,251 --> 06:35:03,940 You can then describe it in one breath\n 6830 06:35:03,940 --> 06:35:08,021 instead of saying it's in big O\nand in omega of something else.
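For reference, the three symbols introduced above have standard textbook definitions, which are more precise than the upper-bound and lower-bound wording used in the lecture. The names f, g, c, and n_0 below do not come from the lecture; f(n) stands for an algorithm's step count, g(n) for the simple formula it is compared against, and both are assumed non-negative.

```latex
\begin{align*}
f(n) \in O(g(n))      &\iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all}\ n \ge n_0 \\
f(n) \in \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 \ \text{such that}\ f(n) \ge c \cdot g(n) \ \text{for all}\ n \ge n_0 \\
f(n) \in \Theta(g(n)) &\iff f(n) \in O(g(n)) \ \text{and}\ f(n) \in \Omega(g(n))
\end{align*}
```

As a quick check against the phone-book example: n/2 is at most 1 times n, so n/2 is in O(n), and n/2 is at least one half times n, so n/2 is also in Omega(n). That makes n/2 Theta(n), which is the formal version of saying the first two algorithms are "pretty much the same" once n is large.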
6831 06:35:08,021 --> 06:35:12,280 All right, so out of context, sort\n 6832 06:35:12,280 --> 06:35:15,824 but all they refer to is upper\nbounds, lower bounds, or when 6833 06:35:15,824 --> 06:35:17,241 they happen to be one and the same. 6834 06:35:17,241 --> 06:35:20,801 And we'll now introduce over time\n 6835 06:35:20,800 --> 06:35:23,350 apply these to concrete problems. 6836 06:35:23,350 --> 06:35:27,580 But first, let me pause to\nsee if there's any questions. 6837 06:35:43,920 --> 06:35:45,880 DAVID J. MALAN: Smaller n\nfunctions move faster. 6838 06:35:45,881 --> 06:35:50,621 So yes, if you have something\nlike n, that takes only n steps. 6839 06:35:50,620 --> 06:35:53,626 If you have a formula like n\n 6840 06:35:53,626 --> 06:35:56,260 that take more steps\nand therefore be slower. 6841 06:35:56,260 --> 06:35:58,170 So the larger the\nmathematical expression 6842 06:35:58,170 --> 06:36:02,640 the slower your algorithm is because the\n 6843 06:36:05,161 --> 06:36:07,530 AUDIENCE: So you want your\nn function to be small? 6844 06:36:07,530 --> 06:36:10,771 DAVID J. MALAN: You want your n function,\n 6845 06:36:10,771 --> 06:36:12,581 And in fact, the Holy\nGrail, so to speak 6846 06:36:12,580 --> 06:36:16,050 would be this last one here either\n 6847 06:36:16,050 --> 06:36:19,260 when an algorithm is on\nthe order of a single step. 6848 06:36:19,260 --> 06:36:23,730 That means it literally takes constant\n 6849 06:36:23,730 --> 06:36:26,850 100 steps, but a fixed,\nconstant number of steps. 6850 06:36:26,850 --> 06:36:30,480 That's the best because even\nas the phone book gets bigger 6851 06:36:30,480 --> 06:36:34,710 even as the data set you're\nsearching gets larger and larger 6852 06:36:34,710 --> 06:36:37,861 if something only takes a finite\nnumber of steps constantly 6853 06:36:37,861 --> 06:36:42,001 then it doesn't matter how big\nthe data set actually gets.
6854 06:36:42,001 --> 06:36:46,201 Questions as well on these notations--\n 6855 06:36:46,201 --> 06:36:47,671 This is actually very helpful. 6856 06:36:47,670 --> 06:36:49,530 I'm seeing pointing this way? 6857 06:36:52,587 --> 06:36:54,920 DAVID J. MALAN: What is the input\nto each of these functions? 6858 06:36:54,920 --> 06:36:58,070 It is an expression of how\nmany steps an algorithm takes. 6859 06:36:58,070 --> 06:37:00,440 So in fact, let me go\nahead and make this 6860 06:37:00,440 --> 06:37:03,580 more concrete with an actual\nexample here if we could. 6861 06:37:03,580 --> 06:37:06,710 So on stage here, we have\nseven lockers which represent 6862 06:37:06,710 --> 06:37:08,661 if you will, an array of memory. 6863 06:37:08,661 --> 06:37:10,760 And this array of\nmemory is maybe storing 6864 06:37:10,760 --> 06:37:14,510 seven integers, seven integers that\n 6865 06:37:14,510 --> 06:37:17,450 And if we want to search\nfor these values, how might 6866 06:37:18,408 --> 06:37:20,617 Well, for this, why don't\nwe make things interesting? 6867 06:37:20,617 --> 06:37:22,170 Would a volunteer like to come on up? 6868 06:37:22,170 --> 06:37:25,280 Have to be masked and on the\ninternet if you are comfortable. 6869 06:37:25,280 --> 06:37:28,550 Both of-- oh, there's someone putting\n 6870 06:37:34,741 --> 06:37:37,441 And in just a moment,\nour brave volunteer 6871 06:37:37,440 --> 06:37:41,431 is going to help me find a\nspecific number in the data set 6872 06:37:41,431 --> 06:37:42,881 that we have here on the screen. 6873 06:37:42,881 --> 06:37:46,591 So come on down, and I'll get things\n 6874 06:37:55,890 --> 06:37:57,473 DAVID J. MALAN: [? Nomira. ?] Nice to meet. 6875 06:37:58,061 --> 06:38:02,011 So here we have for Nomira seven\nlockers or an array of memory. 6876 06:38:02,010 --> 06:38:03,900 And behind each of\nthese doors is a number. 
6877 06:38:03,901 --> 06:38:06,930 And the goal, quite simply,\nis, given this array of memory 6878 06:38:06,930 --> 06:38:12,041 as input, to return, true or false, is\n 6879 06:38:12,041 --> 06:38:14,611 So suppose I care about the number 0. 6880 06:38:14,611 --> 06:38:18,091 What would be the simplest,\nmost correct algorithm you could 6881 06:38:18,091 --> 06:38:22,561 apply in order to find us the number 0? 6882 06:38:22,561 --> 06:38:25,969 OK, try opening the first one. 6883 06:38:25,969 --> 06:38:28,511 All right, and maybe just step\naside so the audience can see. 6884 06:38:28,510 --> 06:38:30,520 I think you have not found 0 yet. 6885 06:38:31,841 --> 06:38:34,480 Let's move on to your next choice. 6886 06:38:36,359 --> 06:38:37,901 DAVID J. MALAN: Oh, go ahead, second door. 6887 06:38:38,776 --> 06:38:41,918 Let's just move from left to\nright, sort of searching our way. 6888 06:38:48,561 --> 06:38:51,591 All right, also not working\nout so well yet, but that's OK. 6889 06:38:51,591 --> 06:38:55,971 If you want to go on to the\nnext, we're still looking for 0. 6890 06:38:57,291 --> 06:38:58,863 All right, it's not so good yet. 6891 06:39:04,460 --> 06:39:10,230 No, that's a-- all\nright, very well done. 6892 06:39:13,390 --> 06:39:16,690 All right, so I kind of set you\nup for a fairly slow algorithm 6893 06:39:16,690 --> 06:39:18,911 but let me just ask you\nto describe what is it 6894 06:39:18,911 --> 06:39:21,895 you did by following\nthe steps I gave you. 6895 06:39:21,895 --> 06:39:24,021 AUDIENCE: I just went one\nby one to each character. 6896 06:39:24,021 --> 06:39:26,021 DAVID J. MALAN: You went one\nby one to each character 6897 06:39:26,021 --> 06:39:27,741 if you want to talk into here. 6898 06:39:27,741 --> 06:39:29,501 So you went one by\none by each character. 6899 06:39:29,501 --> 06:39:32,980 And would you say that algorithm,\nleft to right, is correct? 6900 06:39:35,320 --> 06:39:37,361 AUDIENCE: Or, yes, in the scenario.
6901 06:39:37,361 --> 06:39:38,861 DAVID J. MALAN: OK, yes in this scenario. 6902 06:39:39,881 --> 06:39:40,421 What's going through your mind? 6903 06:39:40,420 --> 06:39:42,460 AUDIENCE: Because it's not the\nmost efficient way to do it. 6904 06:39:43,341 --> 06:39:45,971 So we see a contrast here\nbetween correctness and design. 6905 06:39:45,971 --> 06:39:48,721 I mean, I do think it was correct\n 6906 06:39:50,320 --> 06:39:52,850 But it took some number of steps. 6907 06:39:52,850 --> 06:39:54,640 So in fact, this would be an algorithm. 6908 06:39:54,640 --> 06:39:56,710 It has a name, called linear search. 6909 06:39:56,710 --> 06:39:59,201 And, [? Nomira, ?] as you\ndid, you kind of walked along 6910 06:39:59,201 --> 06:40:01,096 a line going from left to right. 6911 06:40:01,721 --> 06:40:04,241 If you had gone from right\nto left, would the algorithm 6912 06:40:04,241 --> 06:40:07,391 have been fundamentally better? 6913 06:40:08,140 --> 06:40:09,245 DAVID J. MALAN: OK, and why? 6914 06:40:09,245 --> 06:40:11,620 AUDIENCE: Because the zero is\nhere in the first scenario. 6915 06:40:11,620 --> 06:40:15,220 But if it was like, the zero is in\n 6916 06:40:15,221 --> 06:40:19,381 DAVID J. MALAN: Yeah, and so here is\n 6917 06:40:19,381 --> 06:40:20,631 becomes a little less obvious. 6918 06:40:20,631 --> 06:40:23,068 You would absolutely have\ngiven yourself a better result 6919 06:40:23,067 --> 06:40:25,150 if you would just happened\nto start from the right 6920 06:40:25,151 --> 06:40:27,141 or if I had pointed you\nto start over there. 6921 06:40:27,140 --> 06:40:30,268 But the catch is if I asked her to\n 6922 06:40:30,268 --> 06:40:31,600 well, that would have backfired. 6923 06:40:31,600 --> 06:40:33,308 And this time, it\nwould have taken longer 6924 06:40:33,309 --> 06:40:35,811 to find that number because\nit's way over here instead. 
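The point about starting from the left versus the right can be made concrete by counting how many doors get opened in each direction. This is a sketch only: the function names are illustrative, and any array passed in is a stand-in, since the values behind the lecture's lockers are not given in the transcript apart from the 0 at the far right.

```c
// Counts how many doors are opened before `target` is found when walking
// left to right. Returns n if the target is absent (every door was opened).
int doors_opened_from_left(const int doors[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (doors[i] == target)
        {
            return i + 1;
        }
    }
    return n;
}

// Same count, but walking right to left.
int doors_opened_from_right(const int doors[], int n, int target)
{
    for (int i = n - 1; i >= 0; i--)
    {
        if (doors[i] == target)
        {
            return n - i;
        }
    }
    return n;
}
```

With a hypothetical arrangement such as {5, 2, 7, 4, 1, 6, 0}, finding 0 takes 7 doors from the left and 1 from the right, while finding 5 takes 1 from the left and 7 from the right. Neither direction wins in general, so both are on the order of n steps in the worst case.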
6925 06:40:35,811 --> 06:40:40,331 And so in the general case, going\n 6926 06:40:40,330 --> 06:40:43,900 is probably as correct as you can\n 6927 06:40:43,901 --> 06:40:47,878 about the order of these numbers-- and\n 6928 06:40:47,878 --> 06:40:49,960 Some of them are smaller,\nsome of them are bigger. 6929 06:40:49,960 --> 06:40:51,668 There doesn't seem to\nbe rhyme or reason. 6930 06:40:51,669 --> 06:40:55,691 Linear search is about as good as you\n 6931 06:40:55,690 --> 06:40:57,414 a priori about the numbers. 6932 06:40:57,414 --> 06:41:00,081 So I have a little thank you gift\nhere, a little CS stress ball. 6933 06:41:00,080 --> 06:41:03,040 Round of applause for\nour first volunteer. 6934 06:41:08,030 --> 06:41:12,050 Let's try to formalize what I\njust described as linear search 6935 06:41:12,050 --> 06:41:15,298 because indeed, no matter which\nend [? Nomira ?] had started on 6936 06:41:15,298 --> 06:41:17,091 I could have kind of\nchanged up the problem 6937 06:41:17,091 --> 06:41:19,431 to make sure that it\nappears to be running slow. 6938 06:41:20,451 --> 06:41:23,631 If zero were among those doors,\n 6939 06:41:24,620 --> 06:41:30,230 So let's now try to translate what\n 6940 06:41:30,230 --> 06:41:32,570 pseudo code as from week zero. 6941 06:41:32,570 --> 06:41:35,091 So with pseudo code, we\njust need a terse English 6942 06:41:35,091 --> 06:41:38,131 like, or any language, syntax\nto describe what we did. 6943 06:41:38,131 --> 06:41:40,640 So here might be one formulation\nof what [? Nomira ?] did. 6944 06:41:40,640 --> 06:41:45,050 For each door, from left to right,\n 6945 06:41:46,640 --> 06:41:51,990 Else, at the very end of the program,\n 6946 06:41:53,161 --> 06:41:55,161 And by the seventh door,\n[? Nomira ?] had indeed 6947 06:41:55,161 --> 06:41:58,791 returned true by saying,\nwell, there is the zero. 
6948 06:41:58,791 --> 06:42:01,431 But let's consider if this\npseudo code is now correct 6949 06:42:02,510 --> 06:42:06,720 First of all, normally, when we've\n 6950 06:42:06,721 --> 06:42:10,461 And yet down here, return\nfalse is aligned with the for. 6951 06:42:10,460 --> 06:42:14,060 Why did I not indent the return\nfalse, or put another way 6952 06:42:14,061 --> 06:42:21,141 why did I not do if number is behind\n 6953 06:42:21,140 --> 06:42:24,216 Why would that version of this\ncode have been problematic? 6954 06:42:34,681 --> 06:42:37,050 DAVID J. MALAN: OK, I'm not sure\nit's because of redundancy. 6955 06:42:37,050 --> 06:42:39,361 Let me go ahead and\njust make this explicit. 6956 06:42:39,361 --> 06:42:42,870 If I had instead done\nelse return false, I 6957 06:42:42,870 --> 06:42:47,302 don't think it's so much redundancy\nthat I'd be worried about. 6958 06:42:47,302 --> 06:42:48,510 Let me bounce somewhere else. 6959 06:42:49,170 --> 06:42:52,710 AUDIENCE: Um, maybe\n[INAUDIBLE] for the entire list 6960 06:42:52,710 --> 06:42:54,091 after just checking one number. 6961 06:42:54,091 --> 06:42:56,280 DAVID J. MALAN: Yeah, it would\nbe returning false for-- 6962 06:42:56,280 --> 06:42:58,155 even though I'd only\nlooked at-- [? Nomira ?] 6963 06:42:58,155 --> 06:42:59,460 had only looked at one element. 6964 06:42:59,460 --> 06:43:02,503 And it would have been as though if\n 6965 06:43:02,503 --> 06:43:05,971 she opens this up and says, nope,\n 6966 06:43:05,971 --> 06:43:09,069 That would give me an incorrect\nresult because obviously 6967 06:43:09,068 --> 06:43:11,611 at that stage in the algorithm,\nshe wouldn't have even looked 6968 06:43:11,611 --> 06:43:13,081 through any of the other doors. 6969 06:43:13,080 --> 06:43:16,870 So just the original indentation\nof this, if you will 6970 06:43:16,870 --> 06:43:19,560 without the [? 
else, ?]\nis correct because only 6971 06:43:19,561 --> 06:43:23,311 if I get to the bottom of this\nalgorithm or the pseudo code does 6972 06:43:23,311 --> 06:43:26,101 it make sense to conclude\nat that point, once she's 6973 06:43:26,100 --> 06:43:29,580 gone through all of the doors,\nthat nope, there's in fact-- 6974 06:43:29,580 --> 06:43:32,911 the number I'm looking for is,\nin fact, not actually there. 6975 06:43:32,911 --> 06:43:37,151 So how might we consider now the\nrunning time of this algorithm? 6976 06:43:37,151 --> 06:43:40,291 We have a few different\ntypes of vocabulary now. 6977 06:43:40,291 --> 06:43:43,441 And if we consider now how\nwe might think about this 6978 06:43:43,440 --> 06:43:46,980 let's start to translate it from\n 6979 06:43:46,980 --> 06:43:48,780 to something a little lower level. 6980 06:43:48,780 --> 06:43:52,181 We've been writing code using\nn and loops and the like. 6981 06:43:52,181 --> 06:43:56,701 So let's take this higher level\npseudo code and now just kind of 6982 06:43:56,701 --> 06:43:59,251 get a middle ground\nbetween English and C. 6983 06:43:59,251 --> 06:44:03,271 Let me propose that we think about\n 6984 06:44:03,271 --> 06:44:05,041 as being a little more pedantic. 6985 06:44:05,041 --> 06:44:13,181 For i from 0 to n minus 1, if number\n 6986 06:44:13,181 --> 06:44:15,881 Otherwise, at the end of\nthe program, return false. 6987 06:44:15,881 --> 06:44:17,881 Now I'm kind of mixing\nEnglish and C here 6988 06:44:17,881 --> 06:44:20,311 but that's reasonable if the\nreader is familiar with C 6989 06:44:21,901 --> 06:44:23,891 And notice this pattern here. 6990 06:44:23,890 --> 06:44:29,010 This is a way of just saying in pseudo\n 6991 06:44:29,010 --> 06:44:33,720 Start at 0 and then just\ncount up to n minus 1. 6992 06:44:33,721 --> 06:44:37,771 And recall n minus 1 is not one\nshy of the end of the array. 
6993 06:44:37,771 --> 06:44:40,530 N minus 1 is the end of\nthe array because again, we 6994 06:44:41,940 --> 06:44:45,240 So this is a very common way\nof expressing this kind of loop 6995 06:44:45,241 --> 06:44:48,271 from the left all the way\nto the right of an array. 6996 06:44:48,271 --> 06:44:51,570 Doors I'm kind of implicitly\ntreating as the name of this array 6997 06:44:51,570 --> 06:44:54,361 like it's a variable from last\nweek that I defined as being 6998 06:44:54,361 --> 06:44:56,140 an array of integers in this case. 6999 06:44:56,140 --> 06:45:01,050 So doors bracket i means that\nwhen i is 0, it's this location. 7000 06:45:02,580 --> 06:45:06,150 When i is 7 or, more generally n minus-- 7001 06:45:06,151 --> 06:45:10,631 sorry, 6 or, more generally, n\n 7002 06:45:10,631 --> 06:45:13,061 So same idea but a translation of it. 7003 06:45:13,061 --> 06:45:17,281 So now let's consider what the\n 7004 06:45:17,280 --> 06:45:20,370 If we have this menu of possible\nanswers to this question 7005 06:45:20,370 --> 06:45:23,290 how efficient or inefficient\nis this algorithm 7006 06:45:23,291 --> 06:45:26,011 let's take a look in the\ncontext of this pseudo code. 7007 06:45:26,010 --> 06:45:28,860 We don't even have to bother\ngoing all the way to C. 7008 06:45:28,861 --> 06:45:32,081 How do we go about analyzing\neach of these steps? 7009 06:45:33,960 --> 06:45:39,900 This outermost loop here for i from\n 7010 06:45:39,901 --> 06:45:42,151 is going to execute how many times? 7011 06:45:42,151 --> 06:45:45,781 How many times will that loop execute? 7012 06:45:45,780 --> 06:45:48,900 Let me give folks this\nmoment to think on it. 7013 06:45:48,901 --> 06:45:51,691 How many times is that\ngoing to loop here? 7014 06:45:54,721 --> 06:45:55,890 DAVID J. MALAN: n times, right? 7015 06:45:55,890 --> 06:45:58,080 Because it's from 0 to n minus 1. 
7016 06:45:58,080 --> 06:46:00,690 And if it's a little weird to\nthink in from 0 to n minus 1 7017 06:46:00,690 --> 06:46:04,192 this is essentially the same\nmathematically as from 1 to n. 7018 06:46:04,192 --> 06:46:06,150 And that's perhaps a\nlittle more obviously more 7019 06:46:08,050 --> 06:46:12,541 So I might just make a note to myself\n 7020 06:46:12,541 --> 06:46:14,131 What about these inner steps? 7021 06:46:14,131 --> 06:46:17,521 Well, how many steps or seconds\ndoes it take to ask a question? 7022 06:46:19,651 --> 06:46:23,101 if the number you're looking\nfor is behind doors bracket i 7023 06:46:23,100 --> 06:46:25,931 well, as [? Nomira ?] did,\nthat's kind of like one step. 7024 06:46:25,931 --> 06:46:27,181 So you open the door and boom. 7025 06:46:27,181 --> 06:46:30,460 All right, maybe it's two steps,\n 7026 06:46:30,460 --> 06:46:32,681 So this is some constant\nnumber of steps. 7027 06:46:32,681 --> 06:46:34,440 Let's just call it one for simplicity. 7028 06:46:34,440 --> 06:46:37,860 How many steps or seconds\ndoes it take to return true? 7029 06:46:37,861 --> 06:46:40,224 I don't know exactly in\nthe computer's memory 7030 06:46:40,223 --> 06:46:41,640 but that feels like a single step. 7031 06:46:43,300 --> 06:46:46,320 So if this takes one\nstep, this takes one step 7032 06:46:46,320 --> 06:46:49,320 but only if the condition\nis true, it looks 7033 06:46:49,320 --> 06:46:53,730 like you're doing a constant\nnumber of things n times. 7034 06:46:53,730 --> 06:46:56,890 Or maybe you're doing\none additional step. 7035 06:46:56,890 --> 06:46:59,370 So in short, the only thing\nthat really matters here 7036 06:46:59,370 --> 06:47:02,580 in terms of the efficiency or\ninefficiency of the algorithm 7037 06:47:02,580 --> 06:47:05,855 is what are you doing again and again\n 7038 06:47:05,855 --> 06:47:07,230 the thing that's going to add up. 7039 06:47:07,230 --> 06:47:10,440 Doing one thing or two things\na constant number of times? 
7040 06:47:11,370 --> 06:47:16,411 But looping, that's going to add up over\n 7041 06:47:16,411 --> 06:47:19,780 the bigger n is going to be and the\n 7042 06:47:19,780 --> 06:47:22,681 which is all to say if you\nwere to describe roughly 7043 06:47:22,681 --> 06:47:27,480 how many steps does this\nalgorithm take in big O notation 7044 06:47:27,480 --> 06:47:30,420 what might your instincts say? 7045 06:47:30,420 --> 06:47:35,710 How many steps is this algorithm on the\n 7046 06:47:40,471 --> 06:47:42,581 And indeed, that's going\nto be the case here. 7047 06:47:42,811 --> 06:47:44,894 Because you're essentially,\nat the end of the day 7048 06:47:44,894 --> 06:47:48,352 doing n things as an upper\nbound on running time. 7049 06:47:48,352 --> 06:47:50,310 And that's, in fact,\nexactly what happened 7050 06:47:50,311 --> 06:47:52,861 with [? Nomira. ?] She had\nto look at all n lockers 7051 06:47:52,861 --> 06:47:55,811 before finally getting\nto the right answer. 7052 06:47:55,811 --> 06:47:58,831 But what if she got\nlucky and the number we 7053 06:47:58,830 --> 06:48:01,740 were looking for was not\nat the end of the array 7054 06:48:01,741 --> 06:48:04,441 but was at the beginning of the array? 7055 06:48:04,440 --> 06:48:06,010 How might we think about that? 7056 06:48:06,010 --> 06:48:09,480 Well, we have a nomenclature for this\n 7057 06:48:09,480 --> 06:48:12,091 Remember, omega notation\nis a lower bound. 7058 06:48:12,091 --> 06:48:18,600 So given this menu of possible running\n 7059 06:48:18,600 --> 06:48:23,257 what might the omega notation be\n 7060 06:48:24,631 --> 06:48:26,611 DAVID J. MALAN: Omega of 1, and why that? 7061 06:48:28,333 --> 06:48:30,751 DAVID J. MALAN: Right, because if\njust by chance she gets lucky 7062 06:48:30,751 --> 06:48:33,661 and the number she's looking\nfor is right there where 7063 06:48:33,661 --> 06:48:35,850 she begins the algorithm, that's it.
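The upper and lower bounds discussed here can be observed by counting how many doors linear search opens. The following is a small sketch under stated assumptions: `linear_search_steps` is a hypothetical helper, not something from the lecture, and it counts one step per element examined.

```c
#include <stddef.h>

// Hypothetical helper: returns how many elements linear search
// examines before it stops (either on a match or at the end).
size_t linear_search_steps(const int doors[], size_t n, int number)
{
    size_t steps = 0;
    for (size_t i = 0; i < n; i++)
    {
        steps++; // one door opened
        if (doors[i] == number)
        {
            return steps;
        }
    }
    return steps; // not found: all n doors were opened
}
```

On the seven numbers 4, 6, 8, 2, 7, 5, 0, searching for 4 takes one step (the best case, omega of 1), while searching for 0, or for a number that is absent, takes all seven (the worst case, big O of n).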
7064 06:48:36,901 --> 06:48:39,570 Maybe it's two steps if you have\nto unlock the door and open it 7065 06:48:39,570 --> 06:48:41,100 but it's a constant number of steps. 7066 06:48:41,100 --> 06:48:43,170 And the way we describe\nconstant number of steps 7067 06:48:43,170 --> 06:48:45,390 is just with a single number like 1. 7068 06:48:45,390 --> 06:48:49,350 So the omega notation for linear\nsearch might be omega of 1 7069 06:48:49,350 --> 06:48:53,010 because in the best case, she might just\n 7070 06:48:53,010 --> 06:48:56,220 But in the worst case, we need to\n 7071 06:48:58,350 --> 06:49:01,050 So again there's this way\nnow of talking symbolically 7072 06:49:01,050 --> 06:49:06,510 about best cases and worst cases\n 7073 06:49:06,510 --> 06:49:09,240 Theta notation, just\nas a little trivia now 7074 06:49:09,241 --> 06:49:12,326 is it applicable based on the\ndefinition I gave earlier? 7075 06:49:13,201 --> 06:49:15,991 DAVID J. MALAN: OK, no, because you\n 7076 06:49:15,991 --> 06:49:18,481 when those two bounds,\nupper and lower, happen 7077 06:49:18,480 --> 06:49:21,100 to be the same for shorthand\nnotation, if you will. 7078 06:49:21,100 --> 06:49:25,890 So it suffices here to talk about\njust big O and omega notation. 7079 06:49:25,890 --> 06:49:28,240 Well, what if we are a\nlittle smarter about this? 7080 06:49:28,241 --> 06:49:31,711 Let me go ahead and sort\nof semi-secretly here 7081 06:49:32,741 --> 06:49:34,966 But first, how about\none other volunteer? 7082 06:49:34,966 --> 06:49:37,591 One other volunteer-- you have\nto be comfortable with your mask 7083 06:49:37,591 --> 06:49:39,870 and your being on the internet. 7084 06:49:42,541 --> 06:49:44,241 Yes, you want to come on down? 7085 06:49:45,241 --> 06:49:48,001 And don't look at what I'm\ndoing because I'm going to-- 7086 06:49:52,701 --> 06:49:55,191 take your time and\ndon't look up this way 7087 06:49:55,190 --> 06:49:58,911 because I need a moment to\nrearrange all of the numbers. 
7088 06:49:58,911 --> 06:50:01,791 And actually, if you could stay\nright there before coming up 7089 06:50:01,791 --> 06:50:05,271 just an awkward few seconds\nwhile I finish hiding the numbers 7090 06:50:08,251 --> 06:50:10,850 DAVID J. MALAN: I will be right with you. 7091 06:50:10,850 --> 06:50:15,470 Actually, if-- do you want to\nwarm up the crowd for a moment 7092 06:50:16,644 --> 06:50:18,061 So you want to introduce yourself? 7093 06:50:27,861 --> 06:50:30,561 DAVID J. MALAN: All right,\nI think I am ready. 7094 06:50:30,561 --> 06:50:32,061 Thank you for stalling there. 7095 06:50:33,330 --> 06:50:34,070 DAVID J. MALAN: And I didn't catch your name. 7096 06:50:35,811 --> 06:50:36,801 AUDIENCE: Rave, like a party. 7097 06:50:39,291 --> 06:50:41,121 So Rave has kindly volunteered now. 7098 06:50:41,120 --> 06:50:43,085 And I'm going to give you an\nadditional advantage this time. 7099 06:50:43,760 --> 06:50:47,540 DAVID J. MALAN: Unbeknownst to you, I\n 7100 06:50:48,890 --> 06:50:50,480 So they're not in the same\nrandom order like they 7101 06:50:50,480 --> 06:50:52,523 were for [? Nomira. ?]\nYou now have the advantage 7102 06:50:52,523 --> 06:50:55,039 to know that the numbers are\nsorted from small to big. 7103 06:50:55,580 --> 06:50:59,540 DAVID J. MALAN: Given that, and given perhaps\n 7104 06:50:59,541 --> 06:51:03,501 with the phone book, where might you\n 7105 06:51:06,169 --> 06:51:07,961 DAVID J. MALAN: Let's find\nnumber six this time. 7106 06:51:07,960 --> 06:51:09,300 Let's make things interesting. 7107 06:51:11,661 --> 06:51:12,411 DAVID J. MALAN: OK, so the middle. 7108 06:51:14,030 --> 06:51:14,780 DAVID J. MALAN: --that would be right here. 7109 06:51:16,820 --> 06:51:18,681 And you find, sadly, the number five. 7110 06:51:22,742 --> 06:51:24,825 DAVID J. MALAN: All right, and\njust to keep it uniform 7111 06:51:24,826 --> 06:51:27,441 just like I did, I opened to the\nright half of the phone book. 
7112 06:51:27,710 --> 06:51:29,240 DAVID J. MALAN: Let's keep it similar. 7113 06:51:30,583 --> 06:51:32,542 DAVID J. MALAN: All right,\nand, uh, a little too far 7114 06:51:32,543 --> 06:51:34,431 even though I know you\nwanted to go one over. 7115 06:51:34,431 --> 06:51:35,440 AUDIENCE: All good, all good. 7116 06:51:35,440 --> 06:51:37,161 DAVID J. MALAN: And now we're\ngoing to go which direction? 7117 06:51:37,161 --> 06:51:38,578 AUDIENCE: Over here in the middle. 7118 06:51:38,578 --> 06:51:40,681 DAVID J. MALAN: Right, and\nvoila, the number six. 7119 06:51:40,681 --> 06:51:42,201 All right, so very nicely done. 7120 06:51:44,980 --> 06:51:46,681 A little stressful for you as well. 7121 06:51:47,570 --> 06:51:50,151 So here we see by nature\nof the locker door 7122 06:51:50,151 --> 06:51:54,711 still being open sort of an\nartifact of the greater efficiency 7123 06:51:54,710 --> 06:51:57,920 it would seem, of this\nalgorithm because now that Rave 7124 06:51:57,920 --> 06:52:00,830 was given the assumption that\n 7125 06:52:00,830 --> 06:52:04,670 on the left to large on the right,\n 7126 06:52:04,670 --> 06:52:07,970 and conquer algorithm from week zero\n 7127 06:52:09,561 --> 06:52:13,011 And simply by starting in\nthe middle and realizing 7128 06:52:13,010 --> 06:52:17,030 OK, too small, then by going to\n 7129 06:52:17,030 --> 06:52:20,181 went a little too far, then by\ngoing to the left half, 7130 06:52:20,181 --> 06:52:24,021 Rave was able to find in just\nthree steps instead of seven 7131 06:52:24,021 --> 06:52:28,081 the number six in this case that\nwe were actually searching for. 7132 06:52:28,080 --> 06:52:32,250 So you can see that this would\nseem to be more efficient. 7133 06:52:32,251 --> 06:52:35,061 Let's consider for just\na moment is it correct. 7134 06:52:35,061 --> 06:52:40,611 If I had used different numbers but\n 7135 06:52:40,611 --> 06:52:43,739 would this algorithm\nstill have worked?
7136 06:52:45,530 --> 06:52:48,280 Like, why would it still\nhave worked, do you think? 7137 06:52:51,061 --> 06:52:52,811 DAVID J. MALAN: Yeah, so\nso long as the numbers 7138 06:52:52,811 --> 06:52:55,121 are always in the same\norder from left to right 7139 06:52:55,120 --> 06:52:58,330 or, heck, they could even be in reverse\n 7140 06:52:58,330 --> 06:53:02,770 the decisions that Rave was making--\n 7141 06:53:02,771 --> 06:53:05,181 would guide us to the\nsolution no matter what. 7142 06:53:05,181 --> 06:53:07,820 And it would seem to take fewer steps. 7143 06:53:07,820 --> 06:53:10,580 So if we consider now the\npseudo code for this algorithm 7144 06:53:10,580 --> 06:53:12,890 let's take a look how we\nmight describe binary search. 7145 06:53:12,890 --> 06:53:15,760 So binary search we might\ndescribe with something like this. 7146 06:53:15,760 --> 06:53:19,001 If the number is behind the middle\n 7147 06:53:19,001 --> 06:53:21,070 then we can just return true. 7148 06:53:21,070 --> 06:53:24,651 Else if the number is\nless than the middle door 7149 06:53:24,651 --> 06:53:27,161 so if six is less than whatever\nis behind the middle door 7150 06:53:27,161 --> 06:53:29,501 then Rave would have\nsearched the left half. 7151 06:53:29,501 --> 06:53:32,050 Else if the number is\ngreater than the middle door 7152 06:53:32,050 --> 06:53:34,060 Rave would have searched the right half. 7153 06:53:34,061 --> 06:53:38,201 Else, if there are no doors-- and\n 7154 06:53:38,201 --> 06:53:40,070 this up top just to keep things clean. 7155 06:53:40,070 --> 06:53:43,751 If there's no doors, what should Rave\n 7156 06:53:43,751 --> 06:53:47,230 if I gave her no lockers to work with? 
7157 06:53:48,280 --> 06:53:50,380 But this is an important\ncase to consider 7158 06:53:50,381 --> 06:53:54,341 because if in the process of\nsearching by locker by locker 7159 06:53:54,341 --> 06:53:58,991 we might have whittled down the\n 7160 06:53:58,991 --> 06:54:01,731 to one door to zero\ndoors-- and at that point 7161 06:54:01,730 --> 06:54:03,591 we might have had no\ndoors left to search. 7162 06:54:03,591 --> 06:54:06,401 So we have to naturally have a\nscenario for just considering 7163 06:54:07,480 --> 06:54:11,271 So it's not to say that maybe I don't\n 7164 06:54:11,271 --> 06:54:13,271 But as she divides and\ndivides and divides 7165 06:54:13,271 --> 06:54:17,081 if she runs out of lockers to ask those\n 7166 06:54:17,080 --> 06:54:20,020 if I ran out of phone book\npages to tear in half 7167 06:54:20,021 --> 06:54:23,570 I too might have had to\nreturn false as in this case. 7168 06:54:23,570 --> 06:54:26,861 So how can we now describe\nthis a little more like C 7169 06:54:26,861 --> 06:54:30,070 just to give ourselves a variable\n 7170 06:54:30,070 --> 06:54:33,291 Well, I might talk about\ndoors as being an array. 7171 06:54:33,291 --> 06:54:36,851 And so if I want to express the middle\n 7172 06:54:38,830 --> 06:54:40,630 I'm assuming that\nsomeone has done the math 7173 06:54:40,631 --> 06:54:43,811 to figure out what the middle door\n 7174 06:54:43,811 --> 06:54:46,301 And then doors, if the\nnumber we're looking for 7175 06:54:46,300 --> 06:54:49,830 is less than doors bracket\nmiddle, then search door 7176 06:54:49,830 --> 06:54:53,590 zero through doors middle minus 1. 7177 06:54:53,591 --> 06:54:57,971 So again, this is a more pedantic way of\n 7178 06:54:57,971 --> 06:55:00,520 search the left half,\nsearch the right half-- 7179 06:55:00,520 --> 06:55:07,151 but start to now describe it in\n 7180 06:55:07,151 --> 06:55:08,954 like we did with our array notation. 
7181 06:55:08,954 --> 06:55:10,871 The last scenario, of\ncourse, is if the number 7182 06:55:10,870 --> 06:55:13,120 is greater than\ndoors bracket middle 7183 06:55:13,120 --> 06:55:16,420 then Rave would have wanted to\nsearch the middle door plus 1-- 7184 06:55:16,420 --> 06:55:21,610 so 1 over-- through doors n minus 1-- 7185 06:55:22,690 --> 06:55:25,750 So again, just a way of sort of\n 7186 06:55:27,349 --> 06:55:31,180 So how might we translate\nthis now into big O notation? 7187 06:55:31,181 --> 06:55:38,230 Well, in the worst case, how many\n 7188 06:55:39,431 --> 06:55:43,390 Given seven doors or given\nmore generically n doors 7189 06:55:43,390 --> 06:55:47,980 how many times could she go left or go\n 7190 06:55:50,471 --> 06:55:53,148 What's the way to think about that? 7191 06:55:57,640 --> 06:56:00,515 And even if you're not feeling wholly\n 7192 06:56:00,515 --> 06:56:03,611 still, pretty much in programming and\n 7193 06:56:03,611 --> 06:56:06,791 any time we talk about some algorithm\n 7194 06:56:06,791 --> 06:56:10,541 in half, in half, in half,\nor any other multiple 7195 06:56:10,541 --> 06:56:12,941 it's probably involving\nlogarithms in some sense. 7196 06:56:12,940 --> 06:56:15,760 And log base 2 of n essentially\nrefers to the number 7197 06:56:15,760 --> 06:56:21,520 of times you can divide n by 2 until\n 7198 06:56:21,521 --> 06:56:23,771 or equivalently zero doors left. 7199 06:56:24,730 --> 06:56:28,631 So we might say that indeed,\nbinary search is in big O of log n 7200 06:56:28,631 --> 06:56:32,800 because the door that Rave\nopened last, this one 7201 06:56:32,800 --> 06:56:34,720 happened to be three doors away. 7202 06:56:34,721 --> 06:56:37,151 And actually, if you do\nthe math here, that roughly 7203 06:56:37,151 --> 06:56:38,961 works out to be exactly that case. 7204 06:56:38,960 --> 06:56:43,001 If we add one, that's sort of out\n 7205 06:56:43,001 --> 06:56:46,300 we were able to search it\nin just three total steps.
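The binary search pseudo code described above (check the middle door, then search the left or right half, and give up when no doors remain) might look like this in C. It is a minimal sketch, not the lecture's own code: the name `binary_search` and the `low` and `high` parameters are assumptions, and it only works if the array is already sorted from small to large.

```c
#include <stdbool.h>

// Binary search over the sorted range doors[low] through doors[high],
// inclusive. Assumes the values are in ascending order.
bool binary_search(const int doors[], int low, int high, int number)
{
    // No doors left to search
    if (low > high)
    {
        return false;
    }

    int middle = low + (high - low) / 2;

    if (doors[middle] == number)
    {
        return true;
    }
    else if (number < doors[middle])
    {
        // Search the left half: doors[low] through doors[middle - 1]
        return binary_search(doors, low, middle - 1, number);
    }
    else
    {
        // Search the right half: doors[middle + 1] through doors[high]
        return binary_search(doors, middle + 1, high, number);
    }
}
```

Assuming the seven doors held 0, 2, 4, 5, 6, 7, 8 in that order, a call such as `binary_search(doors, 0, 6, 6)` looks at 5, then 7, then 6, which matches the three doors opened in the demonstration.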
7206 06:56:46,300 --> 06:56:48,341 What about omega notation, though? 7207 06:56:48,341 --> 06:56:51,581 Like, in the best case, Rave\nmight have gotten lucky. 7208 06:56:51,580 --> 06:56:53,530 She opened the door, and there it is. 7209 06:56:53,530 --> 06:56:59,330 So how might we describe a lower bound\n 7210 06:57:03,580 --> 06:57:08,170 So here too, we see that in some cases\n 7211 06:57:08,170 --> 06:57:10,060 like, they're pretty equivalent. 7212 06:57:10,061 --> 06:57:15,191 And so this is why sometimes\n 7213 06:57:15,190 --> 06:57:17,650 case in the worst case\nbecause honestly, in general 7214 06:57:17,651 --> 06:57:19,961 who really cares if you just\nget lucky once in a while 7215 06:57:19,960 --> 06:57:21,640 and your algorithm is super fast? 7216 06:57:21,640 --> 06:57:24,611 What you probably care about\nis what's the worst case. 7217 06:57:25,751 --> 06:57:29,530 how long am I going to be sitting\n 7218 06:57:29,530 --> 06:57:35,167 or beach ball trying to give myself\n 7219 06:57:35,168 --> 06:57:38,001 Well, odds are, you're going to\n 7220 06:57:38,001 --> 06:57:39,791 So indeed, moving\nforward, will generally 7221 06:57:39,791 --> 06:57:43,091 talk about the running time of\n 7222 06:57:43,091 --> 06:57:45,140 a little less so in terms of omega. 7223 06:57:45,140 --> 06:57:47,501 But understanding the\nrange can be important 7224 06:57:47,501 --> 06:57:53,061 depending on the nature of the data that\n 7225 06:57:53,061 --> 06:57:55,871 All right let me pause and\nsee if there is any questions. 7226 06:58:03,210 --> 06:58:05,791 AUDIENCE: So this method\nis clearly more efficient 7227 06:58:05,791 --> 06:58:10,800 but it requires that the information\n 7228 06:58:10,800 --> 06:58:14,131 How do you ensure that you\ncan compile information 7229 06:58:14,131 --> 06:58:15,626 in a particular order at scale? 7230 06:58:15,626 --> 06:58:17,501 DAVID J. MALAN: Yeah, it's\na really good question. 
7231 06:58:17,501 --> 06:58:20,376 And if I can generalize it, how do\n 7232 06:58:20,376 --> 06:58:22,920 at scale, which algorithm is better? 7233 06:58:22,920 --> 06:58:25,800 I've sort of led us down\nthis road of implying 7234 06:58:25,800 --> 06:58:27,900 that Rave's second\nalgorithm, binary search 7235 06:58:27,901 --> 06:58:29,941 is better because it's so much faster. 7236 06:58:29,940 --> 06:58:33,960 It's log of n in the worst\ncase instead of big O of n. 7237 06:58:33,960 --> 06:58:37,591 But Rave was given an advantage when\n 7238 06:58:38,582 --> 06:58:40,290 And so that sort of\ninvites the question 7239 06:58:40,291 --> 06:58:42,031 well, given a whole\nbunch of random data 7240 06:58:42,030 --> 06:58:45,070 either a small data set or, heck,\n 7241 06:58:45,070 --> 06:58:48,901 billions of pieces of data,\nshould you sort it first 7242 06:58:48,901 --> 06:58:51,511 from smallest to\nlargest and then search? 7243 06:58:51,510 --> 06:58:56,280 Or should you just dive right\nin and search it linearly? 7244 06:58:56,280 --> 06:58:57,998 Like, how might you think about that? 7245 06:58:57,998 --> 06:58:59,791 If you are Google, for\ninstance, and you've 7246 06:58:59,791 --> 06:59:03,451 got millions, billions of web pages,\n 7247 06:59:03,451 --> 06:59:06,210 because it's always going to work\neven though it might be slow? 7248 06:59:06,210 --> 06:59:09,181 Or should they invest the time\nin sorting all of that data-- 7249 06:59:10,991 --> 06:59:13,261 and then search it more efficiently? 7250 06:59:13,260 --> 06:59:15,798 Like, how do you decide\nbetween those options? 7251 06:59:15,798 --> 06:59:18,091 AUDIENCE: If you're sorting\nthe data, then wouldn't you 7252 06:59:18,091 --> 06:59:20,934 have to go through all of the data? 7253 06:59:20,934 --> 06:59:23,101 DAVID J. MALAN: Yeah, if you had\nto sort the data first-- 7254 06:59:23,100 --> 06:59:25,060 and we don't yet formally\nknow how to do this. 
7255 06:59:25,061 --> 06:59:27,478 But obviously, as humans, we\ncould probably figure it out. 7256 06:59:27,477 --> 06:59:29,640 You do have to look at\nall of the data anyway. 7257 06:59:29,640 --> 06:59:33,120 And so you're sort of wasting your\n 7258 06:59:35,041 --> 06:59:37,259 But maybe it depends a bit more. 7259 06:59:37,258 --> 06:59:39,300 Like, that's absolutely\nright, and if you're just 7260 06:59:39,300 --> 06:59:42,420 searching for one thing in life,\n 7261 06:59:42,420 --> 06:59:46,080 to sort it and then search it because\n 7262 06:59:46,080 --> 06:59:48,240 But what's another scenario\nin which you might not 7263 06:59:48,241 --> 06:59:53,771 worry about that whereby it might\n 7264 06:59:54,271 --> 07:00:00,940 AUDIENCE: [INAUDIBLE] you can go\n 7265 07:00:00,940 --> 07:00:02,170 to find out what's happening. 7266 07:00:02,170 --> 07:00:03,212 DAVID J. MALAN: Yeah, exactly. 7267 07:00:03,212 --> 07:00:05,753 So if your problem is a\nGoogle-like problem where 7268 07:00:05,753 --> 07:00:08,710 you have more than just one user\n 7269 07:00:08,710 --> 07:00:11,140 website page, probably you\nshould incur the cost up front 7270 07:00:11,140 --> 07:00:14,980 and sort the whole thing because\n 7271 07:00:14,980 --> 07:00:17,170 is going to be faster,\nfaster, faster because it's 7272 07:00:17,170 --> 07:00:20,800 going to [INAUDIBLE] algorithm of binary\n 7273 07:00:20,800 --> 07:00:23,681 that's going to add up\nto be way fewer steps 7274 07:00:23,681 --> 07:00:25,971 than doing linear search multiple times. 7275 07:00:25,971 --> 07:00:27,851 So again, kind of\ndepends on the use case 7276 07:00:27,850 --> 07:00:29,710 and kind of depends on\nhow important it is. 7277 07:00:29,710 --> 07:00:32,411 And this happens even\nin real world contexts. 7278 07:00:32,411 --> 07:00:35,260 I think back always to graduate\n 7279 07:00:35,260 --> 07:00:36,970 to analyze some large data set. 
7280 07:00:36,971 --> 07:00:39,761 And honestly, it was actually\neasier at the time for me 7281 07:00:39,760 --> 07:00:42,670 to write pretty inefficient\nbut hopefully correct 7282 07:00:42,670 --> 07:00:43,897 code because you know what? 7283 07:00:43,898 --> 07:00:47,230 I could just go to sleep for eight hours\n 7284 07:00:47,730 --> 07:00:50,695 I didn't have to bother writing\n 7285 07:00:50,695 --> 07:00:51,820 to run it more efficiently. 7286 07:00:52,320 --> 07:00:55,881 Because I was the only user, and I\n 7287 07:00:55,881 --> 07:00:58,061 And so this was kind of\na reasonable approach 7288 07:00:58,061 --> 07:01:01,911 reasonable until I woke up eight\n 7289 07:01:01,911 --> 07:01:05,320 And now I had to spend another eight\n 7290 07:01:05,320 --> 07:01:07,271 But even there, you\nsee an example where 7291 07:01:07,271 --> 07:01:09,251 what is your most precious resource? 7292 07:01:11,080 --> 07:01:13,330 Is it time to write the code? 7293 07:01:13,330 --> 07:01:15,458 Is it the amount of memory\nthe computer is using? 7294 07:01:15,458 --> 07:01:18,251 These are all resources we'll start\n 7295 07:01:18,251 --> 07:01:20,440 depends on what your goals are. 7296 07:01:20,440 --> 07:01:23,411 Any questions, then, on\nupper bounds, lower bounds 7297 07:01:23,411 --> 07:01:26,620 or each of these two\nsearches, linear or binary? 7298 07:01:27,351 --> 07:01:29,940 AUDIENCE: So just, when you're\ncalculating running time 7299 07:01:29,940 --> 07:01:34,677 does the sorting step\ncount for that time? 7300 07:01:34,677 --> 07:01:37,510 DAVID J. MALAN: When analyzing running\n 7301 07:01:37,510 --> 07:01:39,670 If you want it to if you actually do it. 7302 07:01:39,670 --> 07:01:41,260 At the moment, it did not apply. 7303 07:01:41,260 --> 07:01:45,460 I just gave Rave the luxury of\nknowing that the data was sorted. 
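If the sorting step is actually performed, its cost is paid once up front and every later search can use the faster algorithm. The sketch below illustrates that trade-off with the C standard library's `qsort` and `bsearch` from `stdlib.h` instead of a hand-written sort, which the lecture has not yet covered. The helper names `compare_ints`, `sort_numbers`, and `contains_sorted` are assumptions made up for this example.

```c
#include <stdbool.h>
#include <stdlib.h>

// Comparison callback for qsort and bsearch: negative, zero, or
// positive depending on how the two ints compare.
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

// Pay the sorting cost once, before any searching.
void sort_numbers(int numbers[], size_t n)
{
    qsort(numbers, n, sizeof(int), compare_ints);
}

// Each lookup on the already-sorted array goes through bsearch,
// which is typically implemented as a binary search.
bool contains_sorted(const int numbers[], size_t n, int number)
{
    return bsearch(&number, numbers, n, sizeof(int), compare_ints) != NULL;
}
```

With a single lookup, sorting first is probably wasted effort; with many lookups on the same data, the one-time sort is amortized across all of them.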
7304 07:01:45,460 --> 07:01:48,880 But if I really wanted to charge\nher for the amount of time 7305 07:01:48,881 --> 07:01:52,091 it took to find that number six,\nI should have added the time 7306 07:01:52,091 --> 07:01:54,073 to sort plus the time to search. 7307 07:01:54,073 --> 07:01:55,780 And in fact, that's\na road we'll go down. 7308 07:01:55,780 --> 07:01:57,530 Why don't we go ahead and\npace ourselves as before? 7309 07:01:57,530 --> 07:01:58,948 Let's take a 10 minute break here. 7310 07:01:58,948 --> 07:02:01,491 And when we come back, we'll\nwrite some actual code. 7311 07:02:01,491 --> 07:02:05,399 So we've seen a couple of searches--\n 7312 07:02:05,399 --> 07:02:06,941 to be fair, we saw back in week zero. 7313 07:02:06,940 --> 07:02:10,150 But let's actually translate at\n 7314 07:02:10,151 --> 07:02:13,331 using this building block from\nlast week where we can actually 7315 07:02:13,330 --> 07:02:17,180 define an array if we want, like an\n 7316 07:02:17,181 --> 07:02:18,911 So let me switch over to VS Code here. 7317 07:02:18,911 --> 07:02:22,271 Let me go ahead and start\na program called numbers.c. 7318 07:02:22,271 --> 07:02:25,300 And in numbers.c, let me go ahead here. 7319 07:02:25,300 --> 07:02:29,201 And how about let's include\nour familiar header files? 7320 07:02:31,030 --> 07:02:35,690 I'll include stdio.h that we can\n 7321 07:02:35,690 --> 07:02:38,771 And now I'm going to go ahead\nand give myself int main void. 7322 07:02:38,771 --> 07:02:40,460 No command line arguments today. 7323 07:02:40,460 --> 07:02:41,593 So I'll leave that as void. 7324 07:02:41,593 --> 07:02:43,300 And I'm going to go\nahead and give myself 7325 07:02:43,300 --> 07:02:45,771 an array of how about seven numbers? 7326 07:02:45,771 --> 07:02:48,581 So I'll call it int numbers bracket 7. 7327 07:02:48,580 --> 07:02:50,620 And then I can fill\nthis array with numbers.
7328 07:02:50,620 --> 07:02:54,460 Like, numbers brackets 0 can be\n 7329 07:02:54,460 --> 07:02:58,668 could be the number 6, and numbers\n 7330 07:02:58,669 --> 07:03:01,211 And this is the same list that\nwe saw with [? Nomira ?] a bit 7331 07:03:01,210 --> 07:03:03,530 ago where it was 4, then 6, then 8. 7332 07:03:04,280 --> 07:03:06,700 There's actually another\nsyntax I can show you here. 7333 07:03:06,701 --> 07:03:09,581 If you know in advance\nin a C program that you 7334 07:03:09,580 --> 07:03:14,750 want an array of certain values and you\n 7335 07:03:14,751 --> 07:03:17,771 you want, you can actually do this\n 7336 07:03:17,771 --> 07:03:20,921 You can say, don't worry\nabout how big this is. 7337 07:03:20,920 --> 07:03:23,980 It's going to be implicit by\nway of these curly braces. 7338 07:03:23,980 --> 07:03:28,971 Here, I can do 4, 6, 8, 2,\n7, 5, 0, close curly brace. 7339 07:03:28,971 --> 07:03:31,091 So it's a somewhat new\nuse of curly braces. 7340 07:03:31,091 --> 07:03:35,140 But this has the effect of giving\n 7341 07:03:35,140 --> 07:03:36,800 of which are a whole bunch of integers. 7342 07:03:37,390 --> 07:03:41,320 The compiler can infer it from what's\n 7343 07:03:41,320 --> 07:03:44,501 And it seems to be of\nsize 1, 2, 3, 4, 5, 6, 7. 7344 07:03:44,501 --> 07:03:49,870 And all seven elements will be\n 7345 07:03:50,690 --> 07:03:53,140 So just a minor optimization\ncode wise to tighten up 7346 07:03:53,140 --> 07:03:56,451 what would have otherwise been\n 7347 07:03:56,451 --> 07:03:59,441 Now let's go ahead and implement\nlinear search, as we called it. 7348 07:03:59,440 --> 07:04:02,483 And you can do this in a bunch of\n 7349 07:04:02,483 --> 07:04:09,190 For int i get 0, i is\nless than 7 i plus plus. 7350 07:04:09,190 --> 07:04:12,161 Then inside of my loop, I'm\ngoing to ask the question, well 7351 07:04:12,161 --> 07:04:17,381 if the numbers at location i\nequals equals, as we asked of 7352 07:04:17,381 --> 07:04:21,041 [? 
Nomira, ?] the number 0, then I'm\n 7353 07:04:21,041 --> 07:04:25,811 like printf found backslash n. 7354 07:04:25,811 --> 07:04:27,641 And then I'm going to return 0. 7355 07:04:27,640 --> 07:04:30,400 Just because of last week's\ndiscussion of returning 7356 07:04:30,401 --> 07:04:34,511 a value for main when all is well,\n 7357 07:04:34,510 --> 07:04:37,420 just to signal that indeed,\nI found what I'm looking for. 7358 07:04:37,420 --> 07:04:44,920 Otherwise, on what line do I want to\n 7359 07:04:44,920 --> 07:04:46,960 and return something other than 0? 7360 07:04:46,960 --> 07:04:51,220 Right, I don't think I want an else\n 7361 07:04:51,221 --> 07:04:55,390 So on what line would you prefer I\n 7362 07:04:55,390 --> 07:04:58,850 of not found and I'll return an error? 7363 07:05:04,370 --> 07:05:06,078 So at the end of the\nfor loop because you 7364 07:05:06,079 --> 07:05:07,911 want to give the\nprogram or our volunteer 7365 07:05:07,911 --> 07:05:11,341 earlier a chance to go through all\n 7366 07:05:11,341 --> 07:05:14,061 But if you go through the whole\nthing, through the whole loop 7367 07:05:14,061 --> 07:05:17,991 at the very end, you probably just\n 7368 07:05:17,991 --> 07:05:20,421 and then return\nsomething like positive 1 7369 07:05:20,420 --> 07:05:22,400 just to signify that an error happened. 7370 07:05:22,401 --> 07:05:24,531 And again, this was a\nminor detail last week. 7371 07:05:24,530 --> 07:05:28,730 Any time main is successful, the\n 7372 07:05:30,210 --> 07:05:33,380 And if something goes wrong, like you\n 7373 07:05:33,381 --> 07:05:37,311 you might return something other than\n 7374 07:05:37,311 --> 07:05:39,563 or even negative numbers if you want. 7375 07:05:39,562 --> 07:05:41,520 All right, well, let me\ngo ahead and save this. 7376 07:05:49,401 --> 07:05:51,861 All right, and it's found,\nas I would hope it would be. 
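The numbers.c program walked through above can be sketched roughly as follows. This is a minimal sketch rather than the exact file from the lecture: the loop is factored into two helpers, linear_search and search_and_report (both hypothetical names), whereas the lecture writes the loop directly inside main. The return values of search_and_report mirror the exit status described above: 0 when the number is found, 1 when it is not.

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper: the lecture writes this loop directly inside main.
// Returns true if target appears anywhere in the first n elements of numbers.
bool linear_search(const int numbers[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (numbers[i] == target)
        {
            return true;
        }
    }
    // Only after checking every element do we know the target is absent.
    return false;
}

// Hypothetical wrapper that mirrors main's exit status:
// 0 when the target is found, 1 when it is not.
int search_and_report(int target)
{
    // The compiler infers the size (7) from the initializer list.
    int numbers[] = {4, 6, 8, 2, 7, 5, 0};

    if (linear_search(numbers, 7, target))
    {
        printf("Found\n");
        return 0;
    }
    printf("Not found\n");
    return 1;
}
```

Returning search_and_report(0) from main would reproduce the first run above, and passing negative 1 instead would take the not-found path.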
7377 07:05:51,861 --> 07:05:55,070 And just as a little check, let's\n 7378 07:05:55,070 --> 07:05:59,151 not there, like the number negative 1. 7379 07:05:59,151 --> 07:06:02,001 Let me go ahead and recompile\nthe code with make numbers. 7380 07:06:02,001 --> 07:06:04,251 Let me rerun the code\nwith dot slash numbers 7381 07:06:04,251 --> 07:06:06,260 and hopefully-- whew, OK, not found. 7382 07:06:06,260 --> 07:06:08,833 So proof by example seems\nto be working correctly. 7383 07:06:08,833 --> 07:06:11,001 But let's make things a\nlittle more interesting now. 7384 07:06:11,001 --> 07:06:14,061 Right now, I'm using just\nan array of integers. 7385 07:06:14,061 --> 07:06:18,451 Let me go ahead and introduce\nmaybe an array of strings instead. 7386 07:06:18,451 --> 07:06:21,721 And maybe this time, I'll store a\n 7387 07:06:21,721 --> 07:06:23,461 but actual strings of names. 7388 07:06:24,690 --> 07:06:26,490 Well, let me go back to my code here. 7389 07:06:26,491 --> 07:06:30,831 I'm going to switch us over to\nmaybe a file called names.c. 7390 07:06:30,830 --> 07:06:34,490 And in here, I'll go\nahead and include cs50.h. 7391 07:06:37,911 --> 07:06:41,390 And I'm going to go ahead and\nfor now include a new friend 7392 07:06:41,390 --> 07:06:44,838 from last week, string.h, which gives\n 7393 07:06:44,838 --> 07:06:47,631 Int main void because I'm not going\n 7394 07:06:48,841 --> 07:06:53,690 And now if I want an array of strings,\n 7395 07:06:56,480 --> 07:06:58,460 And then I could start\ndoing like before. 7396 07:06:58,460 --> 07:07:01,940 Names bracket 0 could be someone\nlike Bill, and names bracket 1 7397 07:07:01,940 --> 07:07:05,100 could be someone like\nCharlie and so forth. 7398 07:07:05,100 --> 07:07:08,712 But there's this new\nimprovement I can make. 
7399 07:07:08,712 --> 07:07:11,420 Let me just let the compiler figure\n 7400 07:07:11,420 --> 07:07:16,911 And using curly braces, I'll do Bill\n 7401 07:07:16,911 --> 07:07:24,050 George and then Ginny and then Percy and\n 7402 07:07:24,050 --> 07:07:27,291 All right, so now I have\nthese seven names as strings. 7403 07:07:32,091 --> 07:07:35,690 i is less than 7 as before,\ni plus plus as before. 7404 07:07:35,690 --> 07:07:39,260 And inside of the, loop lets this\n 7405 07:07:39,260 --> 07:07:41,990 and suppose we're searching\nfor Ron arbitrarily. 7406 07:07:41,991 --> 07:07:44,451 He is there, so we should\neventually find him. 7407 07:07:44,451 --> 07:07:51,890 Let me go ahead and say if names bracket\n 7408 07:07:51,890 --> 07:07:55,701 of my if condition, I'm going to\n 7409 07:07:55,701 --> 07:07:57,951 And I'm going to return 0\njust because all is well. 7410 07:07:57,951 --> 07:08:00,751 And I'm going to take your\nadvice from the get go this time 7411 07:08:00,751 --> 07:08:04,920 and, at the end of the loop, print out\n 7412 07:08:04,920 --> 07:08:08,030 I have not printed found, and\nI have not returned already. 7413 07:08:08,030 --> 07:08:12,200 So I'm just going to go ahead and\n 7414 07:08:12,201 --> 07:08:14,781 All right, let me go ahead and\ncross my fingers as always. 7415 07:08:17,670 --> 07:08:20,661 And it doesn't seem\nto like my code here. 7416 07:08:20,661 --> 07:08:22,730 This is perhaps a new\nerror that you might not 7417 07:08:22,730 --> 07:08:25,440 have seen yet in names.c line 11. 7418 07:08:25,440 --> 07:08:28,280 So that's this line\nhere, my if condition. 7419 07:08:28,280 --> 07:08:32,330 Result of comparison against a\nstring literal is unspecified. 7420 07:08:32,330 --> 07:08:34,752 Use an explicit string\ncomparison function instead. 7421 07:08:34,753 --> 07:08:37,460 I mean, that's kind of a mouthful,\n 7422 07:08:37,460 --> 07:08:39,960 you're probably not going to\nknow how to make sense of that. 
7423 07:08:39,960 --> 07:08:43,490 But it does kind of draw our\nattention to something being awry 7424 07:08:43,491 --> 07:08:48,201 with the equality checking\nhere, with equals equals and Ron. 7425 07:08:48,201 --> 07:08:50,601 And here's where again\nwe've been telling 7426 07:08:50,600 --> 07:08:53,060 sort of a white lie for\nthe past couple of weeks. 7427 07:08:53,061 --> 07:08:57,261 Strings are a thing in C. Strings\nare a thing in programming. 7428 07:08:57,260 --> 07:08:59,030 But recall from last\nweek, I did disclaim 7429 07:08:59,030 --> 07:09:01,010 there's no such thing\nas a string data type 7430 07:09:01,010 --> 07:09:05,030 technically because it's not\na primitive in the way an int 7431 07:09:05,030 --> 07:09:08,570 and a float and a bool are that are\n 7432 07:09:08,570 --> 07:09:12,530 You can't just use equals\nequals to compare two strings. 7433 07:09:12,530 --> 07:09:15,470 You actually have to use\na special function that's 7434 07:09:15,471 --> 07:09:18,501 in this header file we talked\nbriefly about last week. 7435 07:09:18,501 --> 07:09:21,140 In that header file was\nstring length or strlen. 7436 07:09:21,140 --> 07:09:23,850 But there are other\nfunctions in there as well. 7437 07:09:23,850 --> 07:09:27,440 Let me, in fact, go ahead\nand open up the manual pages. 7438 07:09:32,120 --> 07:09:37,161 In string.h you can perhaps infer\n 7439 07:09:37,161 --> 07:09:40,973 the place of equals equals for today. 7440 07:09:43,041 --> 07:09:47,666 DAVID J. MALAN: So strcmp, S-T-R-C-M-P,\n 7441 07:09:47,666 --> 07:09:49,791 And if I click on that,\nwe'll see more information.
7442 07:09:49,791 --> 07:09:53,871 And indeed, if I click on strcmp,\nwe'll see under the synopsis 7443 07:09:53,870 --> 07:09:58,850 that, OK, I need to use the CS50 header\n 7444 07:09:58,850 --> 07:10:02,210 Here is its prototype,\nwhich is telling me 7445 07:10:02,210 --> 07:10:05,720 that strcmp takes two\nstrings, S1 and S2, that 7446 07:10:05,721 --> 07:10:07,251 are presumably going to be compared. 7447 07:10:07,251 --> 07:10:09,591 And it returns an integer,\nwhich is interesting. 7448 07:10:10,791 --> 07:10:14,091 The description of this function is\n 7449 07:10:15,230 --> 07:10:18,471 So uppercase or lowercase\nmatters, just FYI. 7450 07:10:18,471 --> 07:10:20,931 And then let's look at\nthe return value here. 7451 07:10:20,931 --> 07:10:25,221 The return value of this function\nis an int less than 0 7452 07:10:25,221 --> 07:10:32,841 if S1 comes before S2, 0 if S1 is the\n 7453 07:10:35,300 --> 07:10:39,140 So the reason that this function\n 7454 07:10:39,140 --> 07:10:41,751 bool, true or false, is\nthat it actually will 7455 07:10:41,751 --> 07:10:45,111 allow us to sort these things\n 7456 07:10:45,111 --> 07:10:49,341 if two strings come in this order or\n 7457 07:10:49,341 --> 07:10:51,440 you need three possible return values. 7458 07:10:51,440 --> 07:10:53,271 And a bool, of course,\nonly gives you two 7459 07:10:53,271 --> 07:10:56,841 but an int gives you like 4 billion\n 7460 07:10:56,841 --> 07:11:01,881 So 0 or a positive number or a negative\n 7461 07:11:01,881 --> 07:11:06,320 And the documentation goes on to explain\n 7462 07:11:06,320 --> 07:11:09,620 Recall that capital A\nis 65, capital B is 66 7463 07:11:09,620 --> 07:11:12,021 and it's those underlying\nASCII or Unicode 7464 07:11:12,021 --> 07:11:15,411 numbers that a computer uses to figure\n 7465 07:11:15,411 --> 07:11:17,541 or after it like in the dictionary. 7466 07:11:17,541 --> 07:11:20,311 But for our purposes now,\nwe only care about equality.
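The three-way return value of strcmp described above can be seen in a short sketch. The helper name describe_order is hypothetical, and only strcmp itself comes from string.h. Because only the sign of strcmp's result is guaranteed, not a particular magnitude, the sketch tests for less than, equal to, or greater than 0 rather than for specific numbers.

```c
#include <string.h>

// Hypothetical helper: collapses strcmp's result to -1, 0, or 1,
// since only the sign of the return value is guaranteed.
int describe_order(const char *s1, const char *s2)
{
    int result = strcmp(s1, s2);
    if (result < 0)
    {
        return -1; // s1 comes before s2
    }
    if (result > 0)
    {
        return 1; // s1 comes after s2
    }
    return 0; // the two strings are identical
}
```

Since the comparison works on the underlying character codes, case matters: a lowercase r has a larger code than an uppercase R, so "ron" sorts after "Ron".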
7467 07:11:20,311 --> 07:11:22,041 So I'm going to go ahead and do this. 7468 07:11:22,041 --> 07:11:26,181 If I want to compare names\nbracket i against Ron 7469 07:11:26,181 --> 07:11:33,681 I use str compare or strcmp, names\n 7470 07:11:33,681 --> 07:11:35,870 So it's a little more\ninvolved than actually 7471 07:11:35,870 --> 07:11:40,190 using equals equals, which\ndoes work for integers, longs 7472 07:11:41,361 --> 07:11:45,210 But for strings, it turns out we\n 7473 07:11:45,710 --> 07:11:47,990 Well, recall from last\nweek what a string really is. 7474 07:11:47,991 --> 07:11:50,581 It's an array of characters. 7475 07:11:50,580 --> 07:11:54,050 And so whereas you can use equals\nequals for single characters 7476 07:11:54,050 --> 07:11:56,960 strcmp, as we'll\neventually see, is going 7477 07:11:56,960 --> 07:11:58,798 to compare multiple characters for us. 7478 07:11:59,841 --> 07:12:03,931 There's a loop needed, and that's\n 7479 07:12:03,931 --> 07:12:06,651 But it doesn't just work out of\n 7480 07:12:06,651 --> 07:12:10,628 That would literally be comparing\n 7481 07:12:10,628 --> 07:12:12,710 And we'll come back to\nthis next week as to what's 7482 07:12:12,710 --> 07:12:14,280 really going on under the hood. 7483 07:12:14,280 --> 07:12:18,501 So let me go ahead and fix one\nbug that I just realized I made. 7484 07:12:18,501 --> 07:12:23,811 I want to check if the return\nvalue of str compare is equal to 0 7485 07:12:23,811 --> 07:12:27,021 because per the documentation,\nthat means they're the same. 7486 07:12:27,021 --> 07:12:30,001 All right, let me go ahead\nand make names this time. 7487 07:12:31,070 --> 07:12:34,131 Dot slash names, Enter, found. 7488 07:12:34,131 --> 07:12:39,651 And just as a sanity check, let's\n 7489 07:12:39,651 --> 07:12:43,371 Searching now for Hermione\nafter recompiling the code 7490 07:12:44,931 --> 07:12:46,640 And she's not, in fact, found.
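The names.c search described above might look roughly like this. It is a sketch under a few assumptions: the helper name contains_name is hypothetical (the lecture keeps the loop inside main), and plain const char * stands in for CS50's string type so that the code compiles without the course library.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper; the lecture keeps this loop inside main.
// Uses const char * where the lecture uses CS50's string type.
bool contains_name(const char *names[], int n, const char *target)
{
    for (int i = 0; i < n; i++)
    {
        // strcmp returns 0 when the two strings match exactly.
        // Writing names[i] == target would not compare the characters.
        if (strcmp(names[i], target) == 0)
        {
            return true;
        }
    }
    // Reached only after every name has been checked.
    return false;
}
```

With an array holding the six names that are legible in this transcript (Bill, Charlie, George, Ginny, Percy, and Ron), searching for "Ron" succeeds and searching for "Hermione" does not, which matches the two runs above.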
7491 07:12:46,640 --> 07:12:49,580 So here's just a similar\nimplementation of linear search 7492 07:12:49,580 --> 07:12:53,930 not for integers this time\nbut instead for strings 7493 07:12:53,931 --> 07:12:57,501 the subtlety really being we need\n 7494 07:12:57,501 --> 07:13:02,001 to actually do the legwork for us of\n 7495 07:13:02,001 --> 07:13:05,568 All right, questions on either of these\n 7496 07:13:05,568 --> 07:13:07,253 AUDIENCE: So, if I do [INAUDIBLE] 7497 07:13:07,253 --> 07:13:08,460 DAVID J. MALAN: Ah, good question. 7498 07:13:08,460 --> 07:13:12,620 If I had not fixed what I claimed was\n 7499 07:13:12,620 --> 07:13:15,201 and we saw an example of\nthis last week, actually. 7500 07:13:15,201 --> 07:13:21,831 If a function returns an integer,\n 7501 07:13:21,830 --> 07:13:25,100 when you get back 0, the\nexpression, the Boolean expression 7502 07:13:29,300 --> 07:13:33,830 If a function returns any positive\n 7503 07:13:33,830 --> 07:13:37,130 that's going to be\ninterpreted as true even 7504 07:13:37,131 --> 07:13:41,871 if it's positive or negative, whether\n 7505 07:13:41,870 --> 07:13:45,931 And so if I did this, this\nwould be saying the opposite. 7506 07:13:45,931 --> 07:13:51,291 So if I were to say this, if str compare\n 7507 07:13:51,291 --> 07:13:57,754 implicitly like saying this does not\n 7508 07:13:57,754 --> 07:14:00,171 but you don't want to check\nfor true because, again, we're 7509 07:14:01,710 --> 07:14:05,361 So the reason I did 0\nhere in this case is 7510 07:14:05,361 --> 07:14:08,971 that it explicitly checks for the return\n 7511 07:14:15,309 --> 07:14:17,351 DAVID J. 
MALAN: Yes, you might\nnot have seen this yet 7512 07:14:17,350 --> 07:14:20,940 but you can express the\nequivalent because if you 7513 07:14:20,940 --> 07:14:24,661 want to check if this is\nfalse, you can actually 7514 07:14:24,661 --> 07:14:27,661 use an exclamation point,\nknown as a bang in programming 7515 07:14:29,320 --> 07:14:32,524 So false becomes true,\ntrue becomes false. 7516 07:14:32,524 --> 07:14:34,441 So this would be another\nway of expressing it. 7517 07:14:34,440 --> 07:14:39,300 This is arguably a worse design, though,\n 7518 07:14:39,300 --> 07:14:43,021 says you should be checking\nfor 0 or a positive value 7519 07:14:43,021 --> 07:14:46,111 or a negative value, and this\nlittle trick, while correct 7520 07:14:46,111 --> 07:14:49,861 and I think you can make a reasonable\n 7521 07:14:49,861 --> 07:14:51,780 And I would argue instead\nfor the first way 7522 07:14:51,780 --> 07:14:53,968 checking for equals equals 0 instead. 7523 07:14:53,968 --> 07:14:55,800 And if that's a little\nsubtle, not to worry. 7524 07:14:55,800 --> 07:15:00,600 We'll come back to little syntactic\n 7525 07:15:00,600 --> 07:15:05,130 Other questions on linear\nsearch in these two forms. 7526 07:15:05,131 --> 07:15:06,811 Is there another hand or hands? 7527 07:15:08,971 --> 07:15:10,261 OK, just holler if I missed. 7528 07:15:10,260 --> 07:15:12,372 So let's now actually take\nthis one step further. 7529 07:15:12,372 --> 07:15:15,330 Suppose that we want to write a\n 7530 07:15:15,330 --> 07:15:19,470 a little more like a phone book that\n 7531 07:15:19,471 --> 07:15:21,390 just integers but actual phone numbers. 7532 07:15:21,390 --> 07:15:23,701 Well, we could escalate\nthings like this. 
7533 07:15:23,701 --> 07:15:27,210 We could now have two arrays-- one\n 7534 07:15:27,210 --> 07:15:29,730 And I'm going to use\nstrings for the numbers now 7535 07:15:29,730 --> 07:15:32,370 the phone numbers, because\nin most communities 7536 07:15:32,370 --> 07:15:36,091 phone numbers might have dashes,\n 7537 07:15:36,091 --> 07:15:39,390 that really looks more like a string\n 7538 07:15:39,390 --> 07:15:42,940 Probably don't want to use an int lest\n 7539 07:15:42,940 --> 07:15:47,521 So let me switch back to VS Code here,\n 7540 07:15:47,521 --> 07:15:49,394 in a file called phonebook.c. 7541 07:15:49,394 --> 07:15:51,061 And now let me go ahead and do the same. 7542 07:15:52,741 --> 07:15:58,681 Let me include standardio.h,\nand let me include string.h. 7543 07:15:58,681 --> 07:16:01,741 I'm going to again do int main void. 7544 07:16:01,741 --> 07:16:05,071 And then inside of my program, I'm\n 7545 07:16:06,870 --> 07:16:09,361 String names will be\njust two of us this time. 7546 07:16:12,751 --> 07:16:15,241 And then I'll give myself--\noops, typo already. 7547 07:16:15,241 --> 07:16:18,151 If I want this to be an array, I\n 7548 07:16:18,151 --> 07:16:19,741 The compiler can count for me. 7549 07:16:19,741 --> 07:16:21,661 But I do need the square brackets. 7550 07:16:21,661 --> 07:16:28,111 Then for numbers, I'm again going to\n 7551 07:16:28,111 --> 07:16:33,870 the curly braces that how about\nCarter can be at 1-617-495-1000. 7552 07:16:33,870 --> 07:16:35,640 And how about my own number here-- 7553 07:16:35,640 --> 07:16:39,361 1-949-468-- oh pattern appearing-- 7554 07:16:42,960 --> 07:16:44,890 Well, I've just kind of lined things up. 7555 07:16:44,890 --> 07:16:47,820 So Carter's number is\napparently first in this array 7556 07:16:47,820 --> 07:16:51,161 and I'm claiming that he'll be\n 7557 07:16:51,161 --> 07:16:53,971 I, David, will be the first--\nthe second in the names array 7558 07:16:53,971 --> 07:16:56,628 and second in the numbers array.
7559 07:16:56,628 --> 07:16:59,460 If you want to have a little fun\n 7560 07:16:59,460 --> 07:17:01,630 or call me some time at that number. 7561 07:17:01,631 --> 07:17:05,311 So now let's actually use\nthis data in some way. 7562 07:17:05,311 --> 07:17:08,729 Let's go ahead and actually search\n 7563 07:17:11,850 --> 07:17:16,450 There's two of us this time-- so i less\n 7564 07:17:16,451 --> 07:17:18,841 And now I'm going to practice\nwhat I preached earlier 7565 07:17:18,841 --> 07:17:22,800 and I'm going to use str compare\nto find my name in this case. 7566 07:17:22,800 --> 07:17:29,460 And I'm going to say if strcmp of names\n 7567 07:17:29,460 --> 07:17:33,100 and that equals 0,\nmeaning they're the same 7568 07:17:33,100 --> 07:17:35,970 then just as before, I'm going to\n 7569 07:17:35,971 --> 07:17:37,681 But this time, I'm going to\nmake the program more useful 7570 07:17:37,681 --> 07:17:39,460 and not just say found or not found. 7571 07:17:39,460 --> 07:17:43,411 Now I'm implementing a phone book, like\n 7572 07:17:43,411 --> 07:17:46,741 So I'm going to say something\nlike, quote unquote, found percent 7573 07:17:46,741 --> 07:17:53,191 s backslash n and then actually\nplug in numbers bracket i 7574 07:17:53,190 --> 07:17:56,730 to correspond to the\ncurrent name bracket i. 7575 07:17:56,730 --> 07:17:58,585 And then I'll return 0 as before. 7576 07:17:58,585 --> 07:18:00,960 And then down here if we get\nall the way through the loop 7577 07:18:00,960 --> 07:18:04,480 and David's not there for some reason,\n 7578 07:18:05,940 --> 07:18:10,950 So let me go ahead and compile this\n 7579 07:18:10,951 --> 07:18:13,601 and it seems to have found the number. 7580 07:18:13,600 --> 07:18:17,490 So this code I'm going\nto claim is correct. 7581 07:18:17,491 --> 07:18:20,551 It's kind of stupid because I've\n 7582 07:18:20,550 --> 07:18:22,050 app that only supports two people. 7583 07:18:22,050 --> 07:18:23,863 They're only going to be me and Carter. 
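The two-array version of phonebook.c described above can be sketched as follows. Several details are assumptions: the helper name lookup is hypothetical (the lecture prints from inside main), const char * stands in for CS50's string type, and the second phone number is a placeholder because its last digits are cut off in this transcript.

```c
#include <stddef.h>
#include <string.h>

// Two parallel arrays: names[i] is assumed to belong with numbers[i].
// Nothing in the language enforces that pairing.
static const char *names[] = {"Carter", "David"};
static const char *numbers[] = {
    "+1-617-495-1000",
    "+1-949-468-XXXX" // placeholder: the last digits are cut off in the transcript
};

// Hypothetical helper: returns the number stored at the same index as name,
// or NULL if name is not in the phone book.
const char *lookup(const char *name)
{
    for (int i = 0; i < 2; i++)
    {
        if (strcmp(names[i], name) == 0)
        {
            return numbers[i];
        }
    }
    return NULL;
}
```

The lookup is only correct as long as both arrays keep the same order and length, and nothing in the code checks that they do.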
7584 07:18:23,864 --> 07:18:26,281 This would be like downloading\nthe contacts app on a phone 7585 07:18:26,280 --> 07:18:28,198 and you can only call\ntwo people in the world. 7586 07:18:28,198 --> 07:18:30,181 There's no ability to\nadd names or edit things. 7587 07:18:30,181 --> 07:18:33,221 That, of course, could come later\n 7588 07:18:33,221 --> 07:18:35,054 But for now for the\nsake of discussion, I've 7589 07:18:35,054 --> 07:18:37,811 just hardcoded two\nnames and two numbers. 7590 07:18:37,811 --> 07:18:40,651 But for what it does, I\nclaim this is correct. 7591 07:18:40,651 --> 07:18:43,951 It's going to find me\nand print out my number. 7592 07:18:45,841 --> 07:18:49,921 Let's start to now consider if\nwe're not just using arrays 7593 07:18:49,920 --> 07:18:52,212 but are we using them well? 7594 07:18:52,212 --> 07:18:55,170 We started to use them last week,\n 7595 07:18:55,170 --> 07:18:59,040 And what might I even mean by\nusing an array well or designing 7596 07:19:01,001 --> 07:19:06,300 Any critiques or concerns\nwith why this might not 7597 07:19:06,300 --> 07:19:08,460 be the best road for us\nto be going down when 7598 07:19:08,460 --> 07:19:12,900 I want to implement something like a\n 7599 07:19:12,901 --> 07:19:15,901 It seems all too vulnerable\nto mistakes. 7600 07:19:15,901 --> 07:19:19,981 For instance, if I screw up the actual\n 7601 07:19:19,980 --> 07:19:24,550 such that it's now more or less than\n 7602 07:19:24,550 --> 07:19:27,780 it feels like there's not a tight\n 7603 07:19:27,780 --> 07:19:31,501 of data, and it's just sort of\ntrusting the honor system 7604 07:19:31,501 --> 07:19:37,800 that any time I use names bracket i\n 7605 07:19:38,521 --> 07:19:40,440 If you're the one writing\nthe code, you're probably 7606 07:19:40,440 --> 07:19:42,001 not going to really screw this up.
7607 07:19:42,001 --> 07:19:44,626 But if you start collaborating\nwith someone else or the program 7608 07:19:44,626 --> 07:19:48,061 is getting much, much longer, the\n 7609 07:19:48,061 --> 07:19:52,471 remember that you're sort of just\n 7610 07:19:52,471 --> 07:19:54,781 like this is going to fail eventually. 7611 07:19:54,780 --> 07:19:57,900 Someone's not going to realize that,\n 7612 07:19:57,901 --> 07:20:00,901 And you're going to start out putting\n 7613 07:20:00,901 --> 07:20:05,070 is to say it'd be much nicer if\n 7614 07:20:05,070 --> 07:20:09,210 pieces of data, names and numbers,\n 7615 07:20:09,210 --> 07:20:13,260 that you're not just trusting that\n 7616 07:20:13,260 --> 07:20:16,990 and numbers, have this kind of\nrelationship with themselves. 7617 07:20:16,991 --> 07:20:19,561 So let's consider how\nwe might solve this. 7618 07:20:19,561 --> 07:20:23,761 A new feature today that we'll introduce\n 7619 07:20:23,760 --> 07:20:27,640 In C, we have the ability to\ninvent our own data types 7620 07:20:27,640 --> 07:20:30,916 if you will-- data types\nthat the authors of C decades 7621 07:20:30,916 --> 07:20:32,791 ago just didn't envision\nor just didn't think 7622 07:20:32,791 --> 07:20:36,241 were necessary because we can implement\n 7623 07:20:36,241 --> 07:20:38,641 just as you could create\ncustom puzzle pieces 7624 07:20:38,640 --> 07:20:40,721 or in C, you can create\ncustom functions. 7625 07:20:40,721 --> 07:20:45,121 So in C, can you create\nyour own types of data 7626 07:20:45,120 --> 07:20:49,260 that go beyond the built in ints\nand floats and even strings? 7627 07:20:49,260 --> 07:20:54,900 You can make, for instance, a\n 7628 07:20:54,901 --> 07:20:57,451 type in the context of\nelections or a person data type 7629 07:20:57,451 --> 07:21:00,311 more generically that might\nhave a name and a number. 
7630 07:21:02,080 --> 07:21:07,830 Well, let me go here and propose\n 7631 07:21:07,830 --> 07:21:11,280 wouldn't it be nice if we\ncould have a person data type 7632 07:21:11,280 --> 07:21:13,830 and then we could have\nan array called people? 7633 07:21:13,830 --> 07:21:17,130 And maybe that array is our\nonly array with two things 7634 07:21:19,561 --> 07:21:22,501 But somehow, those data\ntypes, these persons 7635 07:21:22,501 --> 07:21:25,519 would have both a name and a\nnumber associated with them. 7636 07:21:25,519 --> 07:21:27,061 So we don't need two separate arrays. 7637 07:21:27,061 --> 07:21:31,601 We need one array of persons,\na brand new data type. 7638 07:21:33,251 --> 07:21:35,431 Well, if we want every\nperson in the world 7639 07:21:35,431 --> 07:21:37,681 or in this program to\nhave a name and a number 7640 07:21:37,681 --> 07:21:40,741 we literally write out\nfirst those two data types. 7641 07:21:40,741 --> 07:21:42,221 Give me a string called name. 7642 07:21:42,221 --> 07:21:45,211 Give me a string called\nnumber, semicolon after each. 7643 07:21:45,210 --> 07:21:48,390 And then we wrap that,\nthose two lines of code 7644 07:21:48,390 --> 07:21:51,091 with this syntax, which at first\nglance is a little cryptic. 7645 07:21:51,091 --> 07:21:52,771 It's a lot of words all of a sudden. 7646 07:21:52,771 --> 07:21:57,091 But typedef is a new keyword today\nthat defines a new data type. 7647 07:21:57,091 --> 07:22:00,870 This is the C keyword that\nlets you create your own data 7648 07:22:00,870 --> 07:22:02,460 type for the very first time. 7649 07:22:02,460 --> 07:22:07,201 Struct is another related keyword that\n 7650 07:22:07,201 --> 07:22:11,671 a simple data type, like an int or a\n 7651 07:22:13,300 --> 07:22:17,070 It's got some dimensions to it, like\n 7652 07:22:17,070 --> 07:22:19,620 or even 50 things inside of it.
7653 07:22:19,620 --> 07:22:23,670 The last word down here is the name\n 7654 07:22:23,670 --> 07:22:26,340 and it weirdly goes\nafter the curly braces. 7655 07:22:26,341 --> 07:22:30,120 But this is how you invent\na data type called person. 7656 07:22:30,120 --> 07:22:33,030 And what this code is\nimplying is that henceforth 7657 07:22:33,030 --> 07:22:38,341 the compiler, clang, will know that a\n 7658 07:22:38,341 --> 07:22:41,131 a string and a number that's a string. 7659 07:22:41,131 --> 07:22:44,131 And you don't have to worry\nabout having multiple arrays now. 7660 07:22:44,131 --> 07:22:48,311 You can just have an array\nof people moving forward. 7661 07:22:48,311 --> 07:22:50,279 So how can we go about using this? 7662 07:22:50,278 --> 07:22:52,320 Well, let me go back to\nmy code from before where 7663 07:22:52,320 --> 07:22:53,701 I was implementing a phone book. 7664 07:22:53,701 --> 07:22:56,076 And why don't we enhance the\nphone book code a little bit 7665 07:22:56,076 --> 07:22:58,591 by borrowing some of that new syntax? 7666 07:22:58,591 --> 07:23:01,081 Let me go to the top of\nmy program above main 7667 07:23:01,080 --> 07:23:04,020 and define a type that's\na structure or a data 7668 07:23:04,021 --> 07:23:08,861 structure that has a name inside of\n 7669 07:23:08,861 --> 07:23:12,511 And the name of this new structure\n 7670 07:23:12,510 --> 07:23:17,911 Inside of my code now, let me go ahead\n 7671 07:23:17,911 --> 07:23:21,870 Let me give myself an array\ncalled people of size 2. 7672 07:23:21,870 --> 07:23:25,358 And I'm going to use the\nnon-terse way to do this. 7673 07:23:25,358 --> 07:23:26,940 I'm not going to use the curly braces. 7674 07:23:26,940 --> 07:23:31,501 I'm going to more pedantically spell out\n 7675 07:23:31,501 --> 07:23:35,221 at location 0, which is the\nfirst person in the array 7676 07:23:35,221 --> 07:23:37,051 because you always start counting at 0.
7677 07:23:37,050 --> 07:23:40,861 I'm going to give that person\na name of quote unquote Carter. 7678 07:23:40,861 --> 07:23:44,341 And the dot is admittedly one\nnew piece of syntax today too. 7679 07:23:44,341 --> 07:23:46,771 The dot means go inside\nof that structure 7680 07:23:46,771 --> 07:23:50,550 and access the variable called\n 7681 07:23:50,550 --> 07:23:52,710 Similarly, if I'm going\nto give Carter a number 7682 07:23:52,710 --> 07:23:57,390 I can go into people bracket 0 dot\n 7683 07:23:57,390 --> 07:24:02,310 as before plus 1-617-495-1000. 7684 07:24:02,311 --> 07:24:04,591 And then I can do the\nsame for myself here-- 7685 07:24:04,591 --> 07:24:08,506 people bracket-- where should I go? 7686 07:24:08,506 --> 07:24:10,468 OK, one because again, two elements. 7687 07:24:10,468 --> 07:24:11,800 But we started counting at zero. 7688 07:24:11,800 --> 07:24:13,780 Dot name equals quote unquote David. 7689 07:24:13,780 --> 07:24:18,700 And then lastly, people bracket 1\n 7690 07:24:24,730 --> 07:24:27,611 So now if I scroll\ndown here to my logic 7691 07:24:27,611 --> 07:24:30,491 I don't think this part\nneeds to change too much. 7692 07:24:30,491 --> 07:24:35,041 I'm still, for the sake of discussion,\n 7693 07:24:35,041 --> 07:24:37,991 is 0 on up to but not through 2. 7694 07:24:37,991 --> 07:24:41,031 But I think this line\nof code needs to change. 7695 07:24:41,030 --> 07:24:49,271 How should I now refer to the\ni-th person's name as I iterate? 7696 07:24:49,271 --> 07:24:52,611 What should I compare quote\nunquote David to this time? 7697 07:24:54,411 --> 07:24:57,438 AUDIENCE: People bracket i dot name. 7698 07:24:57,438 --> 07:24:59,230 DAVID J. MALAN: Yeah, people\nbracket i dot name. 7699 07:24:59,530 --> 07:25:01,198 Because people is the name of the array. 7700 07:25:01,198 --> 07:25:04,841 Bracket i is the i-th person that we're\n 7701 07:25:04,841 --> 07:25:07,600 first zero, then one, maybe\nhigher if it had more people.
7702 07:25:07,600 --> 07:25:10,931 Then dot is our new syntax for\ngoing inside of a data structure 7703 07:25:10,931 --> 07:25:14,230 and accessing a variable therein\nwhich in this case is name. 7704 07:25:14,230 --> 07:25:16,640 And so I can compare\nDavid just as before. 7705 07:25:16,640 --> 07:25:21,251 So it's a little more verbose, but\n 7706 07:25:21,251 --> 07:25:26,831 because now these people are full\n 7707 07:25:26,830 --> 07:25:29,170 There's no more honor\nsystem inside of my loop 7708 07:25:29,170 --> 07:25:31,462 that this is going to line\nup because in just a moment 7709 07:25:31,462 --> 07:25:34,227 I'm going to fix this one last\nremnant of the previous version. 7710 07:25:34,227 --> 07:25:36,310 And if I can call back on\nyou again, what should I 7711 07:25:36,311 --> 07:25:39,191 change numbers bracket i to this time? 7712 07:25:39,190 --> 07:25:45,120 AUDIENCE: [INAUDIBLE] dot number. 7713 07:25:45,120 --> 07:25:46,751 DAVID J. MALAN: Dot number, exactly. 7714 07:25:46,751 --> 07:25:49,300 So gone is the honor\nsystem that just assumes 7715 07:25:49,300 --> 07:25:52,420 that bracket i in this array lines up\n 7716 07:25:54,791 --> 07:25:56,351 It's an array called people. 7717 07:25:56,350 --> 07:25:58,420 The things it stores are persons. 7718 07:25:58,420 --> 07:26:00,161 A person has a name and a number. 7719 07:26:00,161 --> 07:26:02,578 And so even though it's kind\nof marginal admittedly given 7720 07:26:02,578 --> 07:26:05,411 that this is a short program and\n 7721 07:26:05,411 --> 07:26:07,661 look more complicated\nat first glance, we're 7722 07:26:07,661 --> 07:26:10,901 now laying the foundation for just\n 7723 07:26:10,901 --> 07:26:13,541 can't screw up now the\nassociation of names 7724 07:26:13,541 --> 07:26:17,140 with numbers because every person's\n 7725 07:26:17,140 --> 07:26:21,070 encapsulated inside\nof the same data type. 7726 07:26:21,070 --> 07:26:22,631 And that's a term of art in CS. 
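The struct-based phone book described above might be sketched like this. The person type and the dot syntax come from the lecture. The helper names fill_people and find_number are hypothetical, const char * stands in for CS50's string type, and the second phone number is a placeholder because its digits are cut off in this transcript.

```c
#include <stddef.h>
#include <string.h>

// A person bundles a name and a number into a single value.
typedef struct
{
    const char *name;   // the lecture uses CS50's string type here
    const char *number;
} person;

// Hypothetical helper: fills a two-element array field by field,
// using the dot operator to reach inside each struct.
void fill_people(person people[2])
{
    people[0].name = "Carter";
    people[0].number = "+1-617-495-1000";
    people[1].name = "David";
    people[1].number = "+1-949-468-XXXX"; // placeholder: digits cut off in the transcript
}

// Hypothetical helper: searches people[0..n-1] for name and returns
// that same person's number, or NULL if nobody matches.
const char *find_number(const person people[], int n, const char *name)
{
    for (int i = 0; i < n; i++)
    {
        if (strcmp(people[i].name, name) == 0)
        {
            return people[i].number;
        }
    }
    return NULL;
}
```

Because the name and the number live in the same struct, the loop reads both fields from people[i], so there is no second array whose order has to be kept in step.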
7727 07:26:22,631 --> 07:26:26,081 Encapsulation means to\nencapsulate-- that is, contain-- 7728 07:26:26,080 --> 07:26:28,190 related pieces of information. 7729 07:26:28,190 --> 07:26:34,030 And thus, we have a person that\n 7730 07:26:34,811 --> 07:26:36,671 And this just sets\nthe foundation for all 7731 07:26:36,670 --> 07:26:39,550 of the cool stuff we've talked\nabout and you use every day. 7732 07:26:40,353 --> 07:26:43,271 Well, recall that an image is a bunch\n 7733 07:26:43,271 --> 07:26:46,931 Every one of those dots\nhas RGB values associated 7734 07:26:46,931 --> 07:26:48,791 with it-- red, green, and blue. 7735 07:26:48,791 --> 07:26:52,121 You could imagine now creating\na structure in C probably where 7736 07:26:52,120 --> 07:26:55,900 maybe you have three values,\nthree variables-- one called red 7737 07:26:55,901 --> 07:26:57,761 one called green, one called blue. 7738 07:26:57,760 --> 07:27:00,341 And then you could name the\nthing not person but pixel. 7739 07:27:00,341 --> 07:27:04,271 And now you could store in C three\n 7740 07:27:04,271 --> 07:27:08,338 some green, some blue-- and collectively\n 7741 07:27:08,338 --> 07:27:11,381 And you could imagine doing something\n 7742 07:27:11,381 --> 07:27:14,800 Music, you might have three\n 7743 07:27:14,800 --> 07:27:17,021 the duration, the loudness of it. 7744 07:27:17,021 --> 07:27:20,480 And you can imagine coming up with\n 7745 07:27:20,480 --> 07:27:21,730 So this is a little low level. 7746 07:27:21,730 --> 07:27:24,280 We're just using like a\nfamiliar contacts application. 7747 07:27:24,280 --> 07:27:28,630 But we now have the way in code\nto express most any type of data 7748 07:27:28,631 --> 07:27:32,621 that we might want to implement\nor discuss ultimately. 
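The pixel idea mentioned above could be written along these lines. This is only an illustrative sketch: the lecture names the three variables red, green, and blue, but the choice of one byte per channel (uint8_t) and the helper name make_pixel are assumptions.

```c
#include <stdint.h>

// Assumed layout: one byte (0 to 255) per color channel.
typedef struct
{
    uint8_t red;
    uint8_t green;
    uint8_t blue;
} pixel;

// Hypothetical helper: builds a pixel from three channel values.
pixel make_pixel(uint8_t red, uint8_t green, uint8_t blue)
{
    pixel p;
    p.red = red;
    p.green = green;
    p.blue = blue;
    return p;
}
```

An image could then be stored as an array of pixel values, in the same way the phone book is an array of person values.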
7749 07:27:32,620 --> 07:27:37,870 So any questions now on struct\nor defining our own types 7750 07:27:37,870 --> 07:27:42,470 the purposes for which are to use\n 7751 07:27:42,471 --> 07:27:45,640 now in a better design but\nalso to lay the foundation 7752 07:27:45,640 --> 07:27:50,240 for implementing cooler and cooler\n 7753 07:27:50,741 --> 07:27:52,074 AUDIENCE: What's the [INAUDIBLE] 7754 07:27:52,074 --> 07:27:55,074 DAVID J. MALAN: What's the difference\n 7755 07:27:55,911 --> 07:27:58,751 So slight side note, C\nis not object-oriented. 7756 07:27:58,751 --> 07:28:02,350 Languages like Java and C++ and\n 7757 07:28:02,350 --> 07:28:05,560 of, programmed yourself, had friends\n 7758 07:28:05,561 --> 07:28:09,791 languages in those languages they have\n 7759 07:28:10,811 --> 07:28:14,381 And objects can store not\njust data, like variables. 7760 07:28:14,381 --> 07:28:18,851 Objects can also store functions, and\n 7761 07:28:18,850 --> 07:28:20,740 But it's not sort of conventional. 7762 07:28:20,741 --> 07:28:24,131 In C, you have data\nstructures that store data. 7763 07:28:24,131 --> 07:28:29,140 In languages like Java and C++, you have\n 7764 07:28:29,771 --> 07:28:32,151 Python is an object-oriented\nlanguage as well. 7765 07:28:32,151 --> 07:28:35,631 So we'll see this issue in a few weeks,\n 7766 07:28:36,131 --> 07:28:38,116 AUDIENCE: Could you\nuse this [INAUDIBLE]? 7767 07:28:38,741 --> 07:28:41,381 Could you use this struct to\nredefine how an int is defined? 7768 07:28:42,611 --> 07:28:46,061 We talked a couple of times\nnow about integer overflow. 7769 07:28:46,061 --> 07:28:50,261 And most recently, you might have seen\n 7770 07:28:50,260 --> 07:28:52,841 that was literally related\nto an int overflow. 7771 07:28:52,841 --> 07:28:57,251 That's the result of ints only\nstoring 4 bytes or 32 bits 7772 07:28:57,251 --> 07:29:00,161 or even as long as 64 bits or 8 bytes.
7773 07:29:01,241 --> 07:29:03,881 But if you want to implement\nsome financial software 7774 07:29:03,881 --> 07:29:06,461 or some scientific or\nmathematical software that 7775 07:29:06,460 --> 07:29:10,330 allows you to count way bigger\nthan a typical int or a long 7776 07:29:10,330 --> 07:29:13,495 you could imagine coming\nup with your own structure. 7777 07:29:13,495 --> 07:29:15,620 And in fact, in some\nlanguages there is a structure 7778 07:29:15,620 --> 07:29:19,730 called big int, which allows you\nto express even bigger numbers. 7779 07:29:20,330 --> 07:29:24,770 Well, maybe you store inside of\na big int an array of values. 7780 07:29:24,771 --> 07:29:27,852 And you somehow allow yourself\nto store more and more bits 7781 07:29:27,852 --> 07:29:29,810 based on how high you\nwant to be able to count. 7782 07:29:30,830 --> 07:29:34,200 We now have the ability now to do\n 7783 07:29:34,201 --> 07:29:36,561 even if it's not built in for us. 7784 07:29:43,122 --> 07:29:45,830 DAVID J. MALAN: Could you define a name\n 7785 07:29:46,430 --> 07:29:48,347 It starts to get\nsyntactically a little messy 7786 07:29:48,347 --> 07:29:51,590 so I did it a little more\npedantic line by line. 7787 07:29:52,791 --> 07:29:57,271 AUDIENCE: [INAUDIBLE] function\nyou use for the function 7788 07:29:57,271 --> 07:29:59,701 at the bottom of the [INAUDIBLE]. 7789 07:29:59,701 --> 07:30:03,389 Could you do something\nlike that [INAUDIBLE]? 7790 07:30:03,389 --> 07:30:05,759 DAVID J. MALAN: Prototypes--\nyou have to do in C. You 7791 07:30:05,759 --> 07:30:09,389 have to define anything you're going\n 7792 07:30:09,389 --> 07:30:11,139 to use before you actually use it. 7793 07:30:11,139 --> 07:30:15,190 So it is deliberate that I put it\n 7794 07:30:15,190 --> 07:30:19,230 Otherwise, the compiler would not know\n 7795 07:30:19,230 --> 07:30:22,111 use it here on what's line 14. 
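Returning to the big-int idea raised in the answer above (a structure holding an array of values), here is one possible sketch in C. Everything in it is an assumption made for illustration: the names `bigint`, `bigint_from_string`, `bigint_add`, and `bigint_to_string`, the fixed capacity `MAX_DIGITS`, and the choice of base-10 digits rather than bits. Real big-integer libraries are considerably more elaborate.

```c
#include <string.h>

#define MAX_DIGITS 64 // hypothetical fixed capacity

// A number stored as an array of base-10 digits, least significant first.
typedef struct
{
    int digits[MAX_DIGITS];
    int length; // number of digits in use
} bigint;

// Build a bigint from a string of decimal digits such as "4294967295".
// Assumes 1 to MAX_DIGITS characters, all between '0' and '9'.
bigint bigint_from_string(const char *s)
{
    bigint b = {.length = 0}; // remaining fields start at zero
    int n = (int) strlen(s);
    b.length = n;
    for (int i = 0; i < n; i++)
    {
        b.digits[i] = s[n - 1 - i] - '0';
    }
    return b;
}

// Add digit by digit with a carry, as in grade-school addition.
// Digits beyond MAX_DIGITS would be silently dropped.
bigint bigint_add(bigint a, bigint b)
{
    bigint sum = {.length = 0};
    int carry = 0;
    int longest = a.length > b.length ? a.length : b.length;
    for (int i = 0; i < MAX_DIGITS && (i < longest || carry); i++)
    {
        int total = carry;
        if (i < a.length)
        {
            total += a.digits[i];
        }
        if (i < b.length)
        {
            total += b.digits[i];
        }
        sum.digits[i] = total % 10;
        carry = total / 10;
        sum.length = i + 1;
    }
    return sum;
}

// Write the digits, most significant first, into out.
// out must have room for b.length + 1 characters.
void bigint_to_string(bigint b, char *out)
{
    for (int i = 0; i < b.length; i++)
    {
        out[i] = (char) ('0' + b.digits[b.length - 1 - i]);
    }
    out[b.length] = '\0';
}
```

Adding 1 to 4294967295, the largest value of a 32-bit unsigned integer, yields 4294967296 here instead of wrapping around, because the array simply grows by one digit.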
7796 07:30:22,111 --> 07:30:25,840 So it has to come first, or it has to\n 7797 07:30:25,840 --> 07:30:29,340 so that you include it at\nthe very top of your code. 7798 07:30:37,643 --> 07:30:39,350 DAVID J. MALAN: Yeah, good\nquestion, and we'll 7799 07:30:39,350 --> 07:30:42,860 come back to this later in the term when\n 7800 07:30:42,861 --> 07:30:44,991 and storing things in actual databases. 7801 07:30:44,991 --> 07:30:48,681 Generally speaking, even though we\n 7802 07:30:48,681 --> 07:30:52,161 or in the US, we have social security\n 7803 07:30:52,161 --> 07:30:56,420 often have other punctuation in it,\n 7804 07:30:57,741 --> 07:31:01,701 You could not store any of that syntax\n 7805 07:31:01,701 --> 07:31:03,241 You could only store numbers. 7806 07:31:03,241 --> 07:31:05,301 So one motivation for\nusing a string is just 7807 07:31:05,300 --> 07:31:08,931 I can store whatever the human wanted\n 7808 07:31:10,050 --> 07:31:13,789 Another reason for\nstoring things as strings 7809 07:31:13,789 --> 07:31:15,831 even if they look like\nnumbers, is in the context 7810 07:31:15,830 --> 07:31:17,448 of zip codes in the United States. 7811 07:31:17,449 --> 07:31:18,741 Again, we'll come back to this. 7812 07:31:18,741 --> 07:31:21,051 But long story short--\nyears ago, actually-- 7813 07:31:21,050 --> 07:31:24,093 I was using Microsoft\nOutlook for my email client. 7814 07:31:24,093 --> 07:31:25,550 And eventually I switched to Gmail. 7815 07:31:25,550 --> 07:31:27,260 And this is like 10 plus years ago now. 7816 07:31:27,260 --> 07:31:31,790 And Outlook at the time lets you export\n 7817 07:31:33,140 --> 07:31:34,971 More on that in the weeks to come too. 
7818 07:31:34,971 --> 07:31:36,763 And that just means I\ncould download a text 7819 07:31:36,762 --> 07:31:40,070 file with all of my friends and\n 7820 07:31:40,070 --> 07:31:44,172 Unfortunately, I open that same CSV\n 7821 07:31:44,172 --> 07:31:46,130 just to kind of spot\ncheck it and see if what's 7822 07:31:46,131 --> 07:31:47,761 in there was what I expected. 7823 07:31:47,760 --> 07:31:51,230 And I must have instinctively hit,\n 7824 07:31:51,230 --> 07:31:54,260 And Excel at least has this habit\n 7825 07:31:54,260 --> 07:31:56,990 If things look like numbers,\nit treats them as numbers. 7826 07:31:56,991 --> 07:31:58,401 And Apple Numbers does this too. 7827 07:31:58,401 --> 07:32:00,441 Google Spreadsheets\ndoes this too nowadays. 7828 07:32:00,440 --> 07:32:07,400 But long story short, I then imported\n 7829 07:32:07,401 --> 07:32:11,121 And now 10 plus years later, I'm still\n 7830 07:32:11,120 --> 07:32:17,001 members whose zip codes are in\nCambridge, Massachusetts 2138 7831 07:32:17,001 --> 07:32:20,661 which is missing the 0 because\nwe here in Cambridge are 02138. 7832 07:32:20,661 --> 07:32:23,780 And that's because I\ntreated or I let Excel 7833 07:32:23,780 --> 07:32:26,990 treat what looks like a number\nas an actual number or int 7834 07:32:26,991 --> 07:32:29,901 and now leading zeros become a\n 7835 07:32:29,901 --> 07:32:33,261 mean nothing, but in the\nmail system, they do-- 7836 07:32:34,431 --> 07:32:36,013 All right, other final questions here. 7837 07:32:39,021 --> 07:32:42,861 DAVID J. MALAN: Yeah, so could I have\n 7838 07:32:42,861 --> 07:32:47,091 array to solve the problem\nearlier of having just one array? 7839 07:32:47,091 --> 07:32:51,111 Yes, but one, I would argue\nit's less readable, especially 7840 07:32:51,111 --> 07:32:53,001 as I get lots of names and numbers. 7841 07:32:53,001 --> 07:32:56,091 And two, that too is also kind\nof relying on the honor system. 
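The zip code anecdote above can be reproduced in a few lines of C. This is only a sketch of the general effect, not of what any spreadsheet actually does internally; the function name `round_trip_as_int` is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

// Convert a zip code to an int and back to text, the way a program that
// "treats things that look like numbers as numbers" effectively would.
// The result is written into out, which holds at most size characters.
void round_trip_as_int(const char *zip, char *out, size_t size)
{
    int as_number = atoi(zip);            // "02138" becomes the value 2138
    snprintf(out, size, "%i", as_number); // printed back without the leading 0
}
```

Passing `"02138"` through this function leaves `"2138"` in `out`, whereas keeping the zip code in a string preserves every character the human typed, including leading zeros, dashes, and parentheses.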
7842 07:32:56,091 --> 07:32:59,300 It would be all too easy to omit some\n 7843 07:33:00,300 --> 07:33:04,370 So I would argue it too is not\nas good as introducing a struct. 7844 07:33:05,541 --> 07:33:10,431 Two dimensional arrays just means\n 7845 07:33:10,431 --> 07:33:12,440 All right, so now that\nwe have this ability 7846 07:33:12,440 --> 07:33:16,570 to store different types of data\nlike contacts in a phone book 7847 07:33:16,570 --> 07:33:18,320 having names and\naddresses, let's actually 7848 07:33:18,320 --> 07:33:21,140 take a step back and\nconsider how we might now 7849 07:33:21,140 --> 07:33:26,210 solve one of the original problems by\n 7850 07:33:26,210 --> 07:33:30,291 given in advance and considering,\n 7851 07:33:30,291 --> 07:33:33,261 costly, how time consuming is\nthat because that might tip 7852 07:33:33,260 --> 07:33:37,370 the scales in favor of sorting,\nthen searching, or maybe just 7853 07:33:37,370 --> 07:33:39,320 not sorting and only searching. 7854 07:33:39,320 --> 07:33:42,830 It'll give us a sense of just\nhow expensive, so to speak 7855 07:33:42,830 --> 07:33:44,705 sorting something actually is. 7856 07:33:44,705 --> 07:33:46,580 Well, what's the\nformulation of this problem? 7857 07:33:46,580 --> 07:33:48,140 It's the same thing as week zero. 7858 07:33:49,311 --> 07:33:51,480 We want it to be output as sorted. 7859 07:33:51,480 --> 07:33:54,920 So for instance, if we're\ntaking unsorted input as input 7860 07:33:54,920 --> 07:33:58,255 we want the sorted output as\nthe result. More concretely 7861 07:33:58,256 --> 07:33:59,631 if we've got numbers like these-- 7862 07:33:59,631 --> 07:34:04,461 63852741, which are just\nrandomly arranged numbers-- 7863 07:34:04,460 --> 07:34:08,870 we want to get back out 12345678. 7864 07:34:08,870 --> 07:34:10,800 So we just want those\nthings to be sorted. 
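To make the input and output of the sorting problem concrete, the sketch below treats the C standard library's `qsort` as the black box. It says nothing about which algorithm sits inside; the names `compare_ints` and `sort_numbers` are illustrative assumptions.

```c
#include <stdlib.h>

// Comparison function in the form qsort expects: negative, zero, or positive
// when *a is less than, equal to, or greater than *b.
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;
    return (x > y) - (x < y);
}

// Unsorted input goes in; the same array comes back sorted in ascending order.
void sort_numbers(int numbers[], size_t n)
{
    qsort(numbers, n, sizeof(int), compare_ints);
}
```

Given the lecture's example input `{6, 3, 8, 5, 2, 7, 4, 1}`, calling `sort_numbers(numbers, 8)` leaves the array holding 1 through 8 in order.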
7865 07:34:10,800 --> 07:34:12,830 So again, inside of\nthe black box here is 7866 07:34:12,830 --> 07:34:17,820 going to be one or more algorithms\n 7867 07:34:17,820 --> 07:34:20,041 So how might we go about doing this? 7868 07:34:20,041 --> 07:34:23,541 Well, just to vary things a bit\n 7869 07:34:23,541 --> 07:34:25,941 for a bit more audience participation. 7870 07:34:25,940 --> 07:34:28,150 But this time, we need\neight people if we may. 7871 07:34:28,151 --> 07:34:30,651 All of you have to be comfortable\nappearing on the internet. 7872 07:34:30,651 --> 07:34:33,526 OK, so this is actually quite\n 7873 07:34:33,526 --> 07:34:37,113 How about 1, 2, 3, 4, 5, 6, 7-- 7874 07:34:37,113 --> 07:34:41,421 oh, OK, and someone volunteering\ntheir friend-- number eight. 7875 07:34:43,236 --> 07:34:45,111 And if you could, I'm\ngoing to set things up. 7876 07:34:45,111 --> 07:34:47,991 If you all could join Valerie,\nmy colleague over there 7877 07:34:47,991 --> 07:34:53,271 to give you a prop to use here,\nwe'll go ahead in just a moment 7878 07:34:53,271 --> 07:34:56,195 and try to find some numbers at hand. 7879 07:34:59,541 --> 07:35:05,181 In just a moment, each of our volunteers\n 7880 07:35:05,181 --> 07:35:09,741 And that integer is initially\ngoing to be in unsorted order. 7881 07:35:09,741 --> 07:35:13,461 And I claim that using an algorithm,\nstep by step instructions 7882 07:35:13,460 --> 07:35:18,181 we can probably sort these folks in\n 7883 07:35:18,181 --> 07:35:22,791 So they're in wardrobe right now just\n 7884 07:35:22,791 --> 07:35:27,306 with a Jersey number on it, which will\n 7885 07:35:31,010 --> 07:35:35,630 Give us just a moment to finish\ngetting the attire ready. 7886 07:35:35,631 --> 07:35:40,401 They're being handed\na shirt and a number. 7887 07:35:40,401 --> 07:35:42,981 And let me ask the\naudience for just a moment. 
7888 07:35:42,980 --> 07:35:47,120 As we have these numbers up here on the\n 7889 07:35:47,120 --> 07:35:48,713 They're just in random order. 7890 07:35:48,713 --> 07:35:49,881 And let me ask the audience. 7891 07:35:49,881 --> 07:35:55,341 How would you go about sorting\n 7892 07:35:55,341 --> 07:35:57,291 How would you go about sorting these? 7893 07:35:57,291 --> 07:35:58,498 Yeah, what are your thoughts? 7894 07:35:58,498 --> 07:36:04,687 AUDIENCE: [INAUDIBLE] the number\n 7895 07:36:05,271 --> 07:36:08,907 AUDIENCE: The following number is\n 7896 07:36:09,491 --> 07:36:11,161 AUDIENCE: If not, then [INAUDIBLE]. 7897 07:36:11,161 --> 07:36:13,712 DAVID J. MALAN: OK, so just\nto recap, you would start 7898 07:36:13,712 --> 07:36:15,170 with one of the numbers on the end. 7899 07:36:15,170 --> 07:36:17,570 You would look to the number to\nthe right or to the left of it 7900 07:36:17,570 --> 07:36:18,890 depending on which end you start at. 7901 07:36:18,890 --> 07:36:21,473 And if it's out of order, you\nwould just start to swap things. 7902 07:36:22,710 --> 07:36:24,795 There's a whole bunch\nof mistakes to fix here 7903 07:36:24,795 --> 07:36:26,420 because things are pretty out of order. 7904 07:36:26,420 --> 07:36:29,420 But probably, if you start to\nsolve small problems at a time 7905 07:36:29,420 --> 07:36:32,270 you can achieve the end result of\n 7906 07:36:32,271 --> 07:36:35,181 Other instincts, if you were\njust handed these numbers, how 7907 07:36:35,181 --> 07:36:38,438 you might go about sorting them? 7908 07:36:44,501 --> 07:36:46,280 DAVID J. MALAN: OK, I like that. 7909 07:36:46,280 --> 07:36:50,200 So to recap there, find the smallest\n 7910 07:36:51,431 --> 07:36:54,703 And then presumably, you could do\n 7911 07:36:54,703 --> 07:36:57,411 And that would seem to give you\n 7912 07:36:57,411 --> 07:37:00,070 And if you all are attired here-- 7913 07:37:00,070 --> 07:37:03,161 do you want to come\non up if you're ready? 
7914 07:37:03,161 --> 07:37:05,021 We had some [? felt ?] volunteers too. 7915 07:37:07,390 --> 07:37:09,881 So if you all would like\nto line yourselves up 7916 07:37:09,881 --> 07:37:12,131 facing the audience in\nexactly this order-- so 7917 07:37:12,131 --> 07:37:14,531 whoever is number zero\nshould be way over here 7918 07:37:14,530 --> 07:37:18,070 and whoever is number five\nshould be way over there. 7919 07:37:18,070 --> 07:37:21,280 Feel free to distance as much as you'd\n 7920 07:37:24,013 --> 07:37:25,181 And make a little more room. 7921 07:37:29,451 --> 07:37:31,094 DAVID J. MALAN: 4, hopefully 1. 7922 07:37:31,094 --> 07:37:32,261 Yeah, keep them to the side. 7923 07:37:38,050 --> 07:37:41,435 All right, so here, we have\nan array of eight numbers-- 7924 07:37:41,436 --> 07:37:42,561 eight integers if you will. 7925 07:37:42,561 --> 07:37:45,131 And do you want to each say\na quick hello to the group? 7926 07:38:05,080 --> 07:38:07,265 AUDIENCE: Hi, I'm\nCeleste, and go Strauss. 7927 07:38:08,140 --> 07:38:11,291 Well, welcome all to the stage,\nand let's just visualize 7928 07:38:11,291 --> 07:38:13,791 perhaps organically, how you\neight would solve this problem. 7929 07:38:13,791 --> 07:38:16,901 So we currently have the numbers\n0 through 7 quite out of order. 7930 07:38:16,901 --> 07:38:20,869 Could you go ahead and just sort\nyourselves from 0 through 7? 7931 07:38:25,760 --> 07:38:28,560 DAVID J. MALAN: OK, so what did they just do? 7932 07:38:29,061 --> 07:38:30,621 First of all, yes, very well done. 7933 07:38:34,390 --> 07:38:37,398 How would you describe\nwhat they just did? 7934 07:38:38,230 --> 07:38:40,420 Could you go back into\nthat order on the screen-- 7935 07:38:44,681 --> 07:38:47,800 And could you do exactly\nwhat you just did again? 
7936 07:38:58,550 --> 07:39:01,841 All right, so admittedly, there's kind\n 7937 07:39:01,841 --> 07:39:05,510 except number four, are doing something\n 7938 07:39:05,510 --> 07:39:07,751 And that's not really how\na computer typically works. 7939 07:39:07,751 --> 07:39:11,260 Just like a computer can only look at\n 7940 07:39:11,260 --> 07:39:15,850 at a time, so can a computer only move\n 7941 07:39:15,850 --> 07:39:18,077 a locker, checking what's\nthere, moving it as needed. 7942 07:39:18,078 --> 07:39:21,161 So let's try this more methodically\n 7943 07:39:21,161 --> 07:39:26,710 If you all could randomize\nyourself again to 52741630 7944 07:39:26,710 --> 07:39:29,122 let's take the second of\nthose approaches first. 7945 07:39:29,122 --> 07:39:30,580 I'm going to look at these numbers. 7946 07:39:30,580 --> 07:39:33,230 And even though I as the human\ncan obviously see all the numbers 7947 07:39:33,230 --> 07:39:35,291 and I just kind of have the\nintuition for how to fix this 7948 07:39:35,291 --> 07:39:37,541 we got to be more methodical\nbecause eventually, we've 7949 07:39:37,541 --> 07:39:39,861 got to translate this to\npseudo code and then code. 7950 07:39:40,751 --> 07:39:43,561 I'm going to search for, as you\nproposed, the smallest number. 7951 07:39:43,561 --> 07:39:45,311 And I'm going to start\nfrom left to right. 7952 07:39:45,311 --> 07:39:48,461 I could do it right to left, but left\n 7953 07:39:48,460 --> 07:39:51,440 All right, 5 at this moment is\nthe smallest number I've seen. 7954 07:39:51,440 --> 07:39:54,179 So I'm going to remember that\nin a variable, if you will. 
7955 07:39:54,179 --> 07:39:55,721 Now I'm going to take one more step-- 7956 07:39:56,291 --> 07:39:59,621 OK, 2 I'm going to compare to the\n 7957 07:39:59,620 --> 07:40:03,521 I'm going to forget about 5 and only\n 7958 07:40:04,181 --> 07:40:07,451 7, nope-- I'm going to ignore that\n 7959 07:40:08,170 --> 07:40:11,530 4, 1-- OK, I'm going to\nupdate the variable in mind 7960 07:40:11,530 --> 07:40:12,790 because that's indeed smaller. 7961 07:40:12,791 --> 07:40:15,491 Now obviously, we the humans\nknow that's getting pretty small. 7962 07:40:16,541 --> 07:40:19,991 I have to check all values to see\n 7963 07:40:19,991 --> 07:40:22,796 because 6 is not, 3 is not, but 0 is. 7964 07:40:22,795 --> 07:40:23,920 And what's your name again? 7965 07:40:25,631 --> 07:40:32,320 Where should Celeste or number 0 go\n 7966 07:40:32,320 --> 07:40:33,951 All right, I'm seeing a lot of this. 7967 07:40:33,951 --> 07:40:37,181 So at the beginning of the array,\nso before doing this for real 7968 07:40:37,181 --> 07:40:38,921 let's have you pop out in front. 7969 07:40:38,920 --> 07:40:42,460 And could you all shift\nand make room for Celeste? 7970 07:40:42,460 --> 07:40:46,390 Is this a good idea to have all\nof them move or equivalently 7971 07:40:46,390 --> 07:40:48,911 move everything in the array\nto make room for Celeste 7972 07:40:52,030 --> 07:40:53,238 That felt like a lot of work. 7973 07:40:53,239 --> 07:40:56,561 And even though it happened pretty\n 7974 07:40:56,561 --> 07:40:58,401 to happen just to move her in place. 7975 07:40:58,401 --> 07:41:00,941 So what would be marginally\nsmarter perhaps-- 7976 07:41:00,940 --> 07:41:03,320 a little more efficient, perhaps? 7977 07:41:07,414 --> 07:41:08,831 DAVID J. MALAN: OK, replace two values. 7978 07:41:08,830 --> 07:41:12,450 So if you want to go back to where\n 7979 07:41:12,451 --> 07:41:13,921 he's not in the right place. 7980 07:41:13,920 --> 07:41:15,150 He's got to move eventually. 
7981 07:41:15,859 --> 07:41:18,661 If that's where Celeste belongs,\nwhy don't we just swap 5 and 0? 7982 07:41:18,661 --> 07:41:21,286 So if you want to go ahead and\nexchange places with each other. 7983 07:41:21,286 --> 07:41:22,591 Notice what's just happened. 7984 07:41:22,591 --> 07:41:25,780 The problem I'm trying to\nsolve has gotten smaller. 7985 07:41:25,780 --> 07:41:28,380 Instead of being size\n8, now it's size 7. 7986 07:41:28,381 --> 07:41:31,411 Now granted, I moved 5 to\nanother wrong location. 7987 07:41:31,411 --> 07:41:33,300 But if these numbers\nstarted off randomly 7988 07:41:33,300 --> 07:41:37,150 it doesn't really matter where 5 goes\n 7989 07:41:38,401 --> 07:41:41,791 And now if I go back, my loop\nis sort of coming back around. 7990 07:41:41,791 --> 07:41:46,050 I can ignore Celeste and make this\n 7991 07:41:46,050 --> 07:41:47,791 because I know she's in the right place. 7992 07:41:53,431 --> 07:41:57,960 Now I know as a human this\nshould be my next smallest. 7993 07:41:57,960 --> 07:42:02,460 But why, intuitively, should\nI keep going, do you think? 7994 07:42:02,460 --> 07:42:05,310 I can't sort of optimize as a\nhuman and just say, number 1 7995 07:42:05,311 --> 07:42:07,121 let's get you into the right place. 7996 07:42:07,120 --> 07:42:08,970 I still want to check the whole array. 7997 07:42:10,230 --> 07:42:12,837 AUDIENCE: Perhaps there's another 1. 7998 07:42:12,837 --> 07:42:14,670 DAVID J. MALAN: Maybe there's\nanother 1, and that 7999 07:42:14,670 --> 07:42:16,390 could be another problem altogether. 8000 07:42:17,631 --> 07:42:18,818 AUDIENCE: Could be another 0 8001 07:42:18,818 --> 07:42:20,611 DAVID J. MALAN: There could\nbe another 0 indeed 8002 07:42:20,611 --> 07:42:22,921 but I did go through\nthe list once, right? 8003 07:42:22,920 --> 07:42:24,330 And I kind of know there isn't. 8004 07:42:24,931 --> 07:42:27,661 AUDIENCE: You don't know that\nevery value is represented. 
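The two steps acted out above, scanning left to right while remembering only the smallest value seen so far and then swapping it into place, might look like this in C. The function names `index_of_smallest` and `swap` are illustrative assumptions, not names from the lecture.

```c
// Return the index of the smallest value in numbers[start] through
// numbers[n - 1], remembering only one candidate at a time.
int index_of_smallest(const int numbers[], int start, int n)
{
    int smallest = start; // the smallest seen so far
    for (int i = start + 1; i < n; i++)
    {
        if (numbers[i] < numbers[smallest])
        {
            smallest = i; // forget the old candidate, remember this one
        }
    }
    return smallest;
}

// Exchange two elements rather than shifting everyone over to make room.
void swap(int numbers[], int i, int j)
{
    int tmp = numbers[i];
    numbers[i] = numbers[j];
    numbers[j] = tmp;
}
```

On the volunteers' order 5 2 7 4 1 6 3 0, the first scan finds 0 at the far end (index 7), and `swap(numbers, 0, 7)` exchanges it with 5. The next scan can start at index 1 and finds 1 at index 4.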
8005 07:42:27,661 --> 07:42:31,747 So maybe there's a [INAUDIBLE] You\n 8006 07:42:32,580 --> 07:42:34,960 DAVID J. MALAN: Yeah, I don't\nnecessarily know what is there. 8007 07:42:34,960 --> 07:42:39,300 And honestly, I only stipulated earlier\n 8008 07:42:39,300 --> 07:42:42,541 I could use two and remember the\n 8009 07:42:42,541 --> 07:42:44,491 I could use three variables, four. 8010 07:42:44,491 --> 07:42:47,801 But then I'm going to start to use\n 8011 07:42:47,800 --> 07:42:51,240 So if I've stipulated that I only have\n 8012 07:42:51,241 --> 07:42:53,551 I don't know anything\nmore about these elements 8013 07:42:53,550 --> 07:42:55,758 because the only thing I'm\nremembering at this moment 8014 07:42:55,758 --> 07:42:57,850 is number 1 is the\nsmallest element I've seen. 8015 07:43:01,070 --> 07:43:03,030 OK, I know that number\n1, and your name was-- 8016 07:43:03,739 --> 07:43:06,211 DAVID J. MALAN: --Hannah is\nthe next smallest element. 8017 07:43:06,210 --> 07:43:08,701 I could have everyone move\nover to make room, but nope. 8018 07:43:09,420 --> 07:43:11,340 You know, even though you're\nso close to where I want you 8019 07:43:11,341 --> 07:43:13,530 I'm just going to keep it\nsimple and swap you two. 8020 07:43:13,530 --> 07:43:15,931 So granted, I've made the\nproblem a little worse. 8021 07:43:15,931 --> 07:43:19,561 But on average, I could get\nlucky too and just pop number 2 8022 07:43:20,640 --> 07:43:22,570 Now let me just accelerate this. 8023 07:43:22,570 --> 07:43:26,911 I can now ignore Hannah and Celeste,\n 8024 07:43:34,830 --> 07:43:37,920 So let's go ahead and swap 2 and 7. 8025 07:43:37,920 --> 07:43:40,590 And now I'll just kind of\norchestrate it verbally. 8026 07:43:40,591 --> 07:43:42,311 4, you're about to have to do something. 8027 07:43:46,111 --> 07:43:48,841 OK, 3-- could you swap with 4? 8028 07:43:48,841 --> 07:43:52,036 All right, now we have 7, 6, 4, 5. 8029 07:43:52,036 --> 07:43:54,780 OK, 4, could you swap with 7? 
8030 07:44:02,521 --> 07:44:04,050 And now perhaps round of applause. 8031 07:44:05,341 --> 07:44:09,611 OK, hang on there one minute. 8032 07:44:09,611 --> 07:44:11,491 So we'll do this one other approach. 8033 07:44:11,491 --> 07:44:14,581 And my God, that felt so much\nslower than the first approach 8034 07:44:14,580 --> 07:44:17,400 but that's, one, because I was\n 8035 07:44:17,401 --> 07:44:22,051 But two, we were doing one thing at a\n 8036 07:44:22,050 --> 07:44:25,470 had the luxury of moving\nlike eight different CPUs-- 8037 07:44:25,471 --> 07:44:28,051 brains, if you will-- were all\noperating at the same time. 8038 07:44:28,050 --> 07:44:29,501 And computers like that exist. 8039 07:44:29,501 --> 07:44:32,254 If you have a computer with\nmultiple cores, so to speak 8040 07:44:32,254 --> 07:44:34,171 that's like having a\ncomputer that technically 8041 07:44:34,170 --> 07:44:35,850 can do multiple things at once. 8042 07:44:35,850 --> 07:44:38,830 But software typically, at least\nas we've written it thus far 8043 07:44:38,830 --> 07:44:40,555 can only do one thing at a time. 8044 07:44:40,556 --> 07:44:42,431 So in a bit, we'll add\nup all of these steps. 8045 07:44:42,431 --> 07:44:44,223 But for now, let's take\none other approach. 8046 07:44:44,223 --> 07:44:46,501 If you all could reorder\nyourselves like that-- 8047 07:44:46,501 --> 07:44:51,300 52741630-- let's take\nthe other approach that 8048 07:44:51,300 --> 07:44:54,970 was recommended by just fixing small\n 8049 07:44:54,971 --> 07:44:57,091 So we're back in the original order. 8050 07:44:57,091 --> 07:44:59,172 5 and 2 are clearly out of order. 8051 07:44:59,881 --> 07:45:01,711 Let's just bite this problem off now. 8052 07:45:03,451 --> 07:45:04,890 Now let me take a next step. 8053 07:45:06,690 --> 07:45:09,360 There's a gap, yes, but that\nmight not be a big deal. 8054 07:45:12,780 --> 07:45:15,510 OK, 7 and 1, let's have you swap. 
8055 07:45:15,510 --> 07:45:18,480 7 and 6, let's have you swap. 8056 07:45:22,050 --> 07:45:23,850 Now let me pause for just a moment. 8057 07:45:26,830 --> 07:45:29,700 But have I improved the problem? 8058 07:45:29,701 --> 07:45:32,460 Right, I can't see-- like\nbefore, I can't optimize like 8059 07:45:32,460 --> 07:45:34,570 before because 0 is obviously not here. 8060 07:45:34,570 --> 07:45:38,521 So unless they're still way back there,\n 8061 07:45:40,661 --> 07:45:42,407 But have I made any improvements? 8062 07:45:43,616 --> 07:45:45,841 In what sense is this improved? 8063 07:45:45,841 --> 07:45:50,411 What's a concrete thing you\ncould point to is better? 8064 07:45:50,911 --> 07:45:52,471 AUDIENCE: Sorted the highest number. 8065 07:45:52,471 --> 07:45:55,051 DAVID J. MALAN: I've sorted the\n 8066 07:45:55,050 --> 07:45:59,760 And conversely, if you prefer, Celeste\n 8067 07:45:59,760 --> 07:46:04,330 Now worst case, Celeste is going to\n 8068 07:46:04,330 --> 07:46:06,900 So I might need to do this\nthing like n total times 8069 07:46:06,901 --> 07:46:08,576 to move her all the way over. 8070 07:46:08,576 --> 07:46:09,701 But that might work out OK. 8071 07:46:23,741 --> 07:46:25,871 notice that the high\nvalues, as you noted 8072 07:46:25,870 --> 07:46:28,510 are sort of bubbling up, if you\nwill, to the end of the list. 8073 07:46:36,140 --> 07:46:37,730 5, 6, 7, of course, are good. 8074 07:46:37,730 --> 07:46:40,536 So now you can sort of see\nthe problem resolving itself. 8075 07:46:40,536 --> 07:46:42,161 And let's just do this part now faster. 8076 07:46:48,521 --> 07:46:53,238 All right, now 1 and 2,\n2, and 3, and 0, and good. 8077 07:46:53,238 --> 07:46:54,820 So we do have some optimization there. 8078 07:46:54,820 --> 07:46:57,221 We don't need to keep going\nbecause those all are sorted. 8079 07:47:01,091 --> 07:47:04,690 1 and 0-- and big round\nof applause in closing. 
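The second approach acted out above, comparing neighbors and swapping any pair that is out of order, can be sketched as follows. The names `adjacent_swap_pass` and `sort_by_adjacent_swaps` are hypothetical, and stopping once a pass makes no swaps is one reasonable reading of the "we don't need to keep going" remark rather than the lecture's own code.

```c
// Walk left to right once, swapping each adjacent pair that is out of order.
// Returns how many swaps were made during this pass.
int adjacent_swap_pass(int numbers[], int n)
{
    int swaps = 0;
    for (int i = 0; i < n - 1; i++)
    {
        if (numbers[i] > numbers[i + 1])
        {
            int tmp = numbers[i];
            numbers[i] = numbers[i + 1];
            numbers[i + 1] = tmp;
            swaps++;
        }
    }
    return swaps;
}

// Repeat passes until one of them makes no swaps, meaning all is in order.
void sort_by_adjacent_swaps(int numbers[], int n)
{
    while (adjacent_swap_pass(numbers, n) > 0)
    {
        // Each pass moves the largest unsorted value to its final position.
    }
}
```

Starting from 5 2 7 4 1 6 3 0, a single pass yields 2 5 4 1 6 3 0 7: the largest value, 7, has reached the right end, which matches what the volunteers observed.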
8080 07:47:09,251 --> 07:47:11,501 We need the puppets back,\nbut you can keep the shirts. 8081 07:47:11,501 --> 07:47:13,201 Thank you for volunteering here. 8082 07:47:13,201 --> 07:47:16,171 Feel free to make your\nway exits left or right. 8083 07:47:16,170 --> 07:47:18,120 And let's see if,\nthanks to our volunteers 8084 07:47:18,120 --> 07:47:24,361 here, we can't now formalize a little\n 8085 07:47:24,361 --> 07:47:28,530 I claim that the first algorithm\nour volunteers kindly acted out 8086 07:47:28,530 --> 07:47:30,091 is what's called selection sort. 8087 07:47:30,091 --> 07:47:35,341 And as the name implied, we selected\n 8088 07:47:35,341 --> 07:47:37,890 and again, working our\nway from left to right 8089 07:47:37,890 --> 07:47:42,460 putting Celeste into the right place,\n 8090 07:47:42,460 --> 07:47:45,420 So selection sort, as\nit's formally called 8091 07:47:45,420 --> 07:47:48,360 can be described, for instance,\nwith this pseudo code here-- 8092 07:47:52,710 --> 07:47:54,780 This is just how we talk about arrays. 8093 07:47:54,780 --> 07:47:59,280 The left end is 0, the right end\n 8094 07:47:59,280 --> 07:48:01,080 n happened to be eight people. 8095 07:48:03,061 --> 07:48:06,601 So for i from 0 to n\nminus 1, what did I do? 8096 07:48:06,600 --> 07:48:11,760 I found the smallest number between\n 8097 07:48:13,170 --> 07:48:15,390 It's a little cryptic at\nfirst glance, but this 8098 07:48:15,390 --> 07:48:18,870 is just a very pseudo\ncode-like way of saying 8099 07:48:18,870 --> 07:48:22,320 find the smallest element\namong all eight volunteers 8100 07:48:22,320 --> 07:48:27,480 because if i starts at 0 and n minus\n 8101 07:48:27,480 --> 07:48:31,591 8, 8 people, so 8 minus\n1 is 7, this first 8102 07:48:31,591 --> 07:48:34,681 says find the smallest number\nbetween numbers bracket 0 8103 07:48:34,681 --> 07:48:37,710 and numbers bracket 7, if you will. 8104 07:48:38,911 --> 07:48:42,091 Swap the smallest number\nwith numbers bracket i. 
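The pseudo code just described (for i from 0 to n minus 1, find the smallest number between numbers bracket i and numbers bracket n minus 1, then swap it with numbers bracket i) translates fairly directly into C. This is one possible rendering, with the function name `selection_sort` chosen for illustration.

```c
// Sort numbers[0] through numbers[n - 1] in ascending order.
void selection_sort(int numbers[], int n)
{
    // For i from 0 to n - 1
    for (int i = 0; i < n; i++)
    {
        // Find the smallest number between numbers[i] and numbers[n - 1]
        int smallest = i;
        for (int j = i + 1; j < n; j++)
        {
            if (numbers[j] < numbers[smallest])
            {
                smallest = j;
            }
        }

        // Swap the smallest number with numbers[i]
        int tmp = numbers[i];
        numbers[i] = numbers[smallest];
        numbers[smallest] = tmp;
    }
}
```

Each trip around the outer loop puts one more value in its final position, so the unsorted part of the array shrinks from size n to n minus 1 and so on.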
8105 07:48:42,091 --> 07:48:45,570 So that's how we got Celeste from\n 8106 07:48:45,570 --> 07:48:47,701 We just swapped those two values. 8107 07:48:47,701 --> 07:48:50,201 What then happens next\nin this pseudo code? 8108 07:48:50,201 --> 07:48:52,261 i, of course, goes from 0 to 1. 8109 07:48:52,260 --> 07:48:54,420 And that's the technical\nway of saying now 8110 07:48:54,420 --> 07:48:58,170 find the smallest element among\nthe 7 remaining volunteers 8111 07:48:58,170 --> 07:49:01,930 ignoring Celeste this time because she\n 8112 07:49:01,931 --> 07:49:04,320 So the problem went\nfrom size 8 to size 7. 8113 07:49:04,320 --> 07:49:07,861 And if we repeat, size 6,\n5, 4, 3, 2, 1, until boom 8114 07:49:07,861 --> 07:49:10,181 it's all done at the very end. 8115 07:49:10,181 --> 07:49:13,561 So this is just one way of\nexpressing in pseudo code what 8116 07:49:13,561 --> 07:49:17,401 we did a little more organically\n 8117 07:49:17,401 --> 07:49:19,781 volunteered out in the audience. 8118 07:49:19,780 --> 07:49:24,661 So if we consider, then, the\nefficiency of this algorithm 8119 07:49:24,661 --> 07:49:27,091 maybe abstracting it away\nnow as a bunch of doors 8120 07:49:27,091 --> 07:49:31,320 where the left most again is always\n 8121 07:49:31,320 --> 07:49:34,710 or equivalently, the second to last\n 8122 07:49:34,710 --> 07:49:38,911 is n minus 3 where n might\nbe 8 or anything else 8123 07:49:38,911 --> 07:49:43,980 how do we think about or quantify\n 8124 07:49:46,591 --> 07:49:49,591 I mean, that was a lot\nof steps to be adding up. 8125 07:49:49,591 --> 07:49:53,491 It's probably more than n, right,\n 8126 07:49:54,390 --> 07:49:59,100 It was like n plus n\nminus 1 plus n minus 2. 8127 07:50:01,440 --> 07:50:04,980 We got like the whole\nteam in the orchestra now. 8128 07:50:04,980 --> 07:50:09,541 Let me propose we think about it this\n 8129 07:50:09,541 --> 07:50:13,710 So the first time, I had to\nlook at n different volunteers. 
8130 07:50:13,710 --> 07:50:17,490 n was 8 in this case, but generically,\n 8131 07:50:17,491 --> 07:50:19,591 in order to decide who was the smallest. 8132 07:50:19,591 --> 07:50:21,631 And sure enough, Celeste\nwas at the very end. 8133 07:50:21,631 --> 07:50:23,463 She happened to be all\nthe way to the right. 8134 07:50:23,463 --> 07:50:27,870 But I only knew that once I looked\nat all 8 or all n volunteers. 8135 07:50:27,870 --> 07:50:30,240 So that took me n steps first. 8136 07:50:30,241 --> 07:50:33,631 But once the list was swapped\ninto the right place, then 8137 07:50:33,631 --> 07:50:37,651 my problem was size n minus 1,\nand I had n minus 1 other people 8138 07:50:40,230 --> 07:50:44,186 Then after that, it's n minus 2 plus\n 8139 07:50:44,186 --> 07:50:45,631 dot until I had one final step. 8140 07:50:45,631 --> 07:50:48,820 And it's obvious that I only\nhave one human left to consider. 8141 07:50:48,820 --> 07:50:51,541 So we might wave our hands at\nthis with a little ellipsis 8142 07:50:51,541 --> 07:50:54,761 and just say dot dot dot\nplus 1 for the final step. 8143 07:50:54,760 --> 07:50:56,251 Now what does this actually equal? 8144 07:50:56,251 --> 07:50:57,841 Well, this is where you\nmight think back on, like 8145 07:50:57,841 --> 07:50:59,760 your high school math\nor physics textbook that 8146 07:50:59,760 --> 07:51:03,001 has a little cheat sheet at the end\n 8147 07:51:03,001 --> 07:51:05,850 That happens to work\nout mathematically to be 8148 07:51:05,850 --> 07:51:09,480 n times n plus 1 all divided by 2. 8149 07:51:09,480 --> 07:51:13,050 That's just what that recurrence,\n 8150 07:51:13,050 --> 07:51:15,661 So if you take on faith that\nthat math is correct, let's 8151 07:51:15,661 --> 07:51:19,890 just now multiply this\nout mathematically. 
8152 07:51:19,890 --> 07:51:26,280 That's n squared plus n divided by 2 or\n 8153 07:51:26,280 --> 07:51:29,251 And here's where we're starting\n 8154 07:51:29,251 --> 07:51:34,440 Like, honestly, as n gets really\n 8155 07:51:34,440 --> 07:51:39,300 or a billion web pages in Google search\n 8156 07:51:39,300 --> 07:51:41,760 is going to matter the\nmost mathematically 8157 07:51:43,561 --> 07:51:46,201 Is n squared divided by\n2 the dominant factor 8158 07:51:46,201 --> 07:51:48,620 or is n divided by 2\nthe dominant factor? 8159 07:51:50,355 --> 07:51:51,480 DAVID J. MALAN: Yeah, n squared. 8160 07:51:51,480 --> 07:51:53,650 I mean, no matter what n\nis-- and the bigger it is 8161 07:51:53,651 --> 07:51:56,941 the bigger raising it to\nthe power 2 is going to be. 8162 07:51:57,690 --> 07:52:00,630 Let's just wave our hands at this\nbecause at the end of the day 8163 07:52:00,631 --> 07:52:04,140 as n gets really large, the dominant\n 8164 07:52:04,890 --> 07:52:08,650 Even the divided by 2, as I claimed earlier\n 8165 07:52:08,651 --> 07:52:11,320 the two straight lines if you\nkeep zooming out essentially 8166 07:52:11,320 --> 07:52:16,120 looked the same when n is large enough,\n 8167 07:52:16,850 --> 07:52:21,490 So that is to say a computer scientist\n 8168 07:52:21,491 --> 07:52:24,221 on the order of n squared steps. 8169 07:52:24,221 --> 07:52:25,871 That's an oversimplification. 8170 07:52:25,870 --> 07:52:28,960 If we really added it up, it's\nactually this many steps-- n 8171 07:52:28,960 --> 07:52:30,970 squared divided by 2 plus n over 2. 8172 07:52:30,971 --> 07:52:34,781 But again, if we want to just be able\n 8173 07:52:34,780 --> 07:52:38,170 performance, I think it's going to\n 8174 07:52:38,170 --> 07:52:44,080 order term to get a sense of what the\n 8175 07:52:44,080 --> 07:52:46,900 or what it even looks like graphically. 
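Written out, the arithmetic described above is the following, using the same step count the lecture uses (n looks on the first pass, n minus 1 on the second, and so on down to 1):

```latex
\begin{aligned}
n + (n-1) + (n-2) + \dots + 1 &= \frac{n(n+1)}{2} \\
                              &= \frac{n^2 + n}{2} = \frac{n^2}{2} + \frac{n}{2} \\
                              &\in O(n^2)
\end{aligned}
```

For the eight volunteers, that is 8 times 9 divided by 2, or 36 steps. As n grows, the n squared term dwarfs the n over 2 term, which is why only the highest-order term is kept.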
8176 07:52:46,901 --> 07:52:50,531 All right, so with that said,\nwe might describe bubble sort 8177 07:52:52,151 --> 07:52:56,061 sorry, selection sort as\nbeing in big O of n squared. 8178 07:52:56,061 --> 07:53:01,391 But what if we consider now the\n 8179 07:53:01,390 --> 07:53:03,431 to talk about a lower bound? 8180 07:53:03,431 --> 07:53:07,666 In the best case, how many\nsteps does selection sort take? 8181 07:53:07,666 --> 07:53:09,041 Well, here, we need some context. 8182 07:53:09,041 --> 07:53:11,681 Like, what does it mean to be\nthe best case or the worst case 8183 07:53:13,451 --> 07:53:16,960 Like, what could you imagine meaning\n 8184 07:53:16,960 --> 07:53:20,011 trying to sort a bunch of numbers? 8185 07:53:20,011 --> 07:53:21,460 I got the whole crew here again. 8186 07:53:21,760 --> 07:53:23,411 AUDIENCE: They would already be sorted. 8187 07:53:23,411 --> 07:53:24,730 DAVID J. MALAN: All right, they're\nalready sorted, right? 8188 07:53:24,730 --> 07:53:28,390 I can't really imagine a better scenario\n 8189 07:53:28,390 --> 07:53:30,400 but they're already sorted for me. 8190 07:53:30,401 --> 07:53:35,531 But does this algorithm\nleverage that fact in practice? 8191 07:53:35,530 --> 07:53:38,470 Even if all of our humans\nhad lined up from 0 to 7 8192 07:53:38,471 --> 07:53:41,291 I'm pretty sure I would have\npretty naively started here. 8193 07:53:41,291 --> 07:53:43,031 And yes, Celeste happens to be here. 8194 07:53:43,030 --> 07:53:47,736 But I only know she needs to be here\n 8195 07:53:47,736 --> 07:53:50,361 And then I would have realized,\nwell, that was a waste of time. 8196 07:53:51,640 --> 07:53:53,861 But then what would I have done? 8197 07:53:53,861 --> 07:53:57,221 I would have ignored her position\n 8198 07:53:57,221 --> 07:54:00,621 I would have done the same thing now\n 8199 07:54:00,620 --> 07:54:03,681 So every time I walk through,\nI'm not doing much useful work. 
8200 07:54:03,681 --> 07:54:06,221 But I am doing those\ncomparisons because I 8201 07:54:06,221 --> 07:54:09,861 don't know until I do the work that\n 8202 07:54:09,861 --> 07:54:14,800 So this would seem to imply that\n 8203 07:54:14,800 --> 07:54:18,100 scenario, even, a lower bound on the\n 8204 07:54:20,201 --> 07:54:21,831 DAVID J. MALAN: A little louder? 8205 07:54:22,771 --> 07:54:24,611 DAVID J. MALAN: It's still\ngoing to be n squared 8206 07:54:24,611 --> 07:54:30,181 in fact, because the code I'm giving\n 8207 07:54:30,181 --> 07:54:34,620 from any of that scenario because\nit just mindlessly continues 8208 07:54:36,131 --> 07:54:41,371 So in this case, yes, I would claim that\n 8209 07:54:43,190 --> 07:54:44,940 So those are the kinds\nof numbers to beat. 8210 07:54:44,940 --> 07:54:48,120 It seems like the upper bound\nand lower bound of selection 8211 07:54:50,491 --> 07:54:52,741 And so we can also describe\nselection sort, therefore 8212 07:54:52,741 --> 07:54:54,078 as being in theta of n squared. 8213 07:54:54,078 --> 07:54:56,911 That's the first algorithm we've\n 8214 07:54:56,911 --> 07:54:59,010 which is to say that it's kind of slow. 8215 07:54:59,010 --> 07:55:00,955 I mean, maybe other\nalgorithms are slower 8216 07:55:00,955 --> 07:55:02,580 but this isn't the best starting point. 8217 07:55:03,730 --> 07:55:07,291 Well, there's a reason that I guided us\n 8218 07:55:07,291 --> 07:55:09,791 Even though you verbally proposed\nthem in a different order 8219 07:55:09,791 --> 07:55:12,960 this second algorithm we did is\ngenerally known as bubble sort. 8220 07:55:12,960 --> 07:55:15,570 And I deliberately used\nthat word a bit ago 8221 07:55:15,570 --> 07:55:19,291 saying the big values are\nbubbling their way up to the right 8222 07:55:19,291 --> 07:55:22,351 to kind of capture the fact that,\nindeed, this algorithm works 8223 07:55:22,980 --> 07:55:25,230 But let's consider if\nit's better or worse. 
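The claim that selection sort gains nothing from an already sorted list can be checked with a small sketch. This is one possible C version of selection sort, not necessarily the code shown on the slide; the name `selection_sort` and the choice to return a comparison count are assumptions made for illustration. It counts pairwise comparisons, so the total is n(n - 1)/2 rather than the n(n + 1)/2 "steps" counted on stage, but both are on the order of n squared.

```c
// Swap two ints in place.
static void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

// Selection sort (ascending). Returns how many comparisons were made
// so that best-case and worst-case inputs can be compared.
long selection_sort(int numbers[], int n)
{
    long comparisons = 0;
    for (int i = 0; i < n - 1; i++)
    {
        // Walk the unsorted part to find the smallest remaining value.
        int smallest = i;
        for (int j = i + 1; j < n; j++)
        {
            comparisons++;
            if (numbers[j] < numbers[smallest])
            {
                smallest = j;
            }
        }
        // Put it in position i, even if it was already there.
        swap(&numbers[i], &numbers[smallest]);
    }
    return comparisons;
}
```

With 8 values the function makes 28 comparisons whether the input is 0 through 7 in order or in reverse, which matches the theta of n squared description.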
8224 07:55:25,230 --> 07:55:28,320 So here, we have pseudo\ncode for bubble sort. 8225 07:55:28,320 --> 07:55:30,291 You could write this\ntoo in different ways. 8226 07:55:30,291 --> 07:55:32,911 But let's consider what\nwe did on the stage. 8227 07:55:32,911 --> 07:55:36,210 We repeated the following\nn minus 1 times. 8228 07:55:36,210 --> 07:55:39,751 We initialized at least, even though\n 8229 07:55:39,751 --> 07:55:44,791 a variable like i from 0\nto n minus 2, n minus 2. 8230 07:55:44,791 --> 07:55:46,171 And then I asked this question. 8231 07:55:46,170 --> 07:55:52,830 If numbers bracket i and numbers\n 8232 07:55:54,611 --> 07:55:56,941 So again, I just did it more\nintuitively by pointing 8233 07:55:56,940 --> 07:55:59,310 but this would be a way,\nwith a bit of pseudo code 8234 07:55:59,311 --> 07:56:00,508 to describe what's going on. 8235 07:56:00,508 --> 07:56:03,091 But notice that I'm doing something\na little differently here. 8236 07:56:03,091 --> 07:56:07,091 I'm iterating from i\nequals 0 to n minus 2. 8237 07:56:07,591 --> 07:56:11,100 Well, if I'm comparing two\nthings, left hand and right hand 8238 07:56:11,100 --> 07:56:13,050 I'd still want to start at 0. 8239 07:56:13,050 --> 07:56:15,810 But I don't want to go\nall the way to n minus 1 8240 07:56:15,811 --> 07:56:19,409 because then, I'd be going past\nthe boundary of my array, which 8241 07:56:19,951 --> 07:56:22,441 I want to make sure that my\nleft hand-- i, if you will-- 8242 07:56:22,440 --> 07:56:27,181 stops at n minus 2 so that when\nI plus 1 in my pseudo code 8243 07:56:27,181 --> 07:56:29,971 I'm looking at the last two\nelements, not the last element 8244 07:56:31,260 --> 07:56:33,093 That's actually a common\nprogramming mistake 8245 07:56:33,094 --> 07:56:34,980 that we'll undoubtedly\nsoon make by going 8246 07:56:34,980 --> 07:56:37,240 beyond the boundaries of your array.
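The pseudo code just described can be sketched in C roughly as follows. The caption for the comparison itself is cut off, so the test `numbers[i] > numbers[i + 1]` is an assumption based on sorting from smallest to largest; the function name `bubble_sort` is also illustrative.

```c
// Bubble sort (ascending), following the structure described above.
void bubble_sort(int numbers[], int n)
{
    // Repeat the whole pass n - 1 times.
    for (int pass = 0; pass < n - 1; pass++)
    {
        // i is the "left hand". It stops at n - 2 so that
        // numbers[i + 1] never reads past the end of the array.
        for (int i = 0; i <= n - 2; i++)
        {
            // If the adjacent pair is out of order, swap it.
            if (numbers[i] > numbers[i + 1])
            {
                int tmp = numbers[i];
                numbers[i] = numbers[i + 1];
                numbers[i + 1] = tmp;
            }
        }
    }
}
```

Letting i run to n - 1 instead would make `numbers[i + 1]` refer to `numbers[n]`, one element beyond the array, which is the boundary mistake mentioned above.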
8247 07:56:37,241 --> 07:56:43,891 So this pseudo code, then, allows me to\n 8248 07:56:43,890 --> 07:56:46,091 and swap them if they're out of order. 8249 07:56:46,091 --> 07:56:50,550 Why do I repeat the whole\nthing n minus 1 times? 8250 07:56:50,550 --> 07:56:55,861 Like, why does it not suffice\njust to do this loop here? 8251 07:56:55,861 --> 07:56:59,341 Think what happened with Celeste. 8252 07:56:59,341 --> 07:57:03,841 Why do I repeat this whole\nthing n minus 1 times? 8253 07:57:12,216 --> 07:57:14,591 DAVID J. MALAN: Indeed, and I think\nif I can recap accurately 8254 07:57:14,591 --> 07:57:16,001 think back to Celeste again. 8255 07:57:16,001 --> 07:57:18,491 And I'm sorry to keep calling\non you as our number 0. 8256 07:57:18,491 --> 07:57:22,031 Each time through bubble\nsort, she only moved one step. 8257 07:57:22,030 --> 07:57:25,640 And so in total, if there's n\nlocations, at the end of the day 8258 07:57:25,640 --> 07:57:30,291 she needs to move n minus 1 steps to get\n 8259 07:57:30,291 --> 07:57:34,691 And so this inner loop, if you\n 8260 07:57:34,690 --> 07:57:37,060 that just fixes some of the problems. 8261 07:57:37,061 --> 07:57:40,511 But it doesn't fix all of the problems\n 8262 07:57:42,021 --> 07:57:45,771 And so how might we quantify the\nrunning time of this algorithm? 8263 07:57:45,771 --> 07:57:49,030 Well, one way to see it is to just\n 8264 07:57:49,030 --> 07:57:53,530 The outer loop repeats n\nminus 1 times by definition. 8265 07:57:54,850 --> 07:58:00,320 The inner loop, the for loop,\nalso iterates n minus 1 times. 8266 07:58:00,820 --> 07:58:02,901 Because it's going from 0 to n minus 2. 8267 07:58:02,901 --> 07:58:06,941 And if that's hard to think about,\n 8268 07:58:06,940 --> 07:58:09,681 if you just add 1 to\nboth ends of the formula. 8269 07:58:09,681 --> 07:58:14,271 So that means you're doing n\nminus 1 things n minus 1 times. 
8270 07:58:14,271 --> 07:58:16,901 So I literally multiply how\nmany times the outer loop 8271 07:58:16,901 --> 07:58:20,320 is running by how many times the\n 8272 07:58:20,320 --> 07:58:23,681 sort of FOIL method n minus 1 squared. 8273 07:58:23,681 --> 07:58:25,661 And I could multiply\nthat whole thing out. 8274 07:58:25,661 --> 07:58:28,690 Well, let's consider this just\na little more methodically here. 8275 07:58:28,690 --> 07:58:32,742 If I have n minus 1 on the\nouter, n minus 1 on the inner-- 8276 07:58:32,742 --> 07:58:33,950 let's go ahead and FOIL this. 8277 07:58:33,951 --> 07:58:37,541 So n squared minus n\nminus n plus 1, combine 8278 07:58:37,541 --> 07:58:40,931 like terms-- n squared minus 2n plus 1. 8279 07:58:40,931 --> 07:58:45,831 And now which of these terms is clearly\n 8280 07:58:46,811 --> 07:58:48,081 DAVID J. MALAN: --the n squared. 8281 07:58:48,080 --> 07:58:50,620 So yes, even though\nminus 2n is a good thing 8282 07:58:50,620 --> 07:58:53,021 because it's subtracting off\nsome of the time required 8283 07:58:53,021 --> 07:58:56,050 plus 1 is not that big a thing,\n 8284 07:58:56,050 --> 07:58:58,480 n gets really large, like\nin the millions or billions 8285 07:58:58,480 --> 07:59:02,870 certainly, that bubble sort, too,\nis on the order of n squared. 8286 07:59:02,870 --> 07:59:05,470 It's not the same exactly\nas selection sort. 8287 07:59:05,471 --> 07:59:07,390 But as n gets big,\nhonestly, we're barely 8288 07:59:07,390 --> 07:59:09,890 going to be able to notice\nthe difference most likely. 8289 07:59:09,890 --> 07:59:13,550 And so it too might be said to\nbe on the order of n squared. 8290 07:59:13,550 --> 07:59:18,890 And if we consider now the lower\n 8291 07:59:18,890 --> 07:59:23,080 here's where things get\npotentially interesting.
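The (n - 1) times (n - 1) count worked out above can be confirmed by running the two loops and counting how often the inner body executes. This is only a counting sketch; `bubble_comparisons` is a hypothetical name, and no sorting happens here.

```c
// Count how many times the inner comparison would run in the
// unoptimized bubble sort: n - 1 passes of n - 1 comparisons each.
long bubble_comparisons(long n)
{
    long count = 0;
    for (long pass = 0; pass < n - 1; pass++)
    {
        for (long i = 0; i <= n - 2; i++)
        {
            count++;
        }
    }
    return count;
}
```

For n = 8 this gives 49, and for n = 1000 it gives 998001, both equal to n squared minus 2n plus 1. At n = 1000 the n squared term contributes 1000000 and the other two terms together remove only 1999.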
8292 07:59:23,080 --> 07:59:28,810 What might you claim is the running\n 8293 07:59:28,811 --> 07:59:32,591 And the best case, I claim, is when\n 8294 07:59:32,591 --> 07:59:35,081 Is our pseudo code going\nto take that into account? 8295 07:59:43,403 --> 07:59:45,070 DAVID J. MALAN: Yes, and that's the key word. 8296 07:59:45,070 --> 07:59:49,510 To summarize, in bubble sort, I do have\n 8297 07:59:49,510 --> 07:59:52,120 don't look at all n elements,\nthat I'm theoretically 8298 07:59:52,120 --> 07:59:53,620 just guessing if it's sorted or not. 8299 07:59:53,620 --> 07:59:55,841 Like, I obviously\nintuitively have to look 8300 07:59:55,841 --> 07:59:58,751 at every element to decide yay\nor nay, it's in the right order. 8301 07:59:58,751 --> 08:00:01,541 And my original pseudo code,\nthough, is pretty naive. 8302 08:00:01,541 --> 08:00:07,001 It's just going to blindly go back and\n 8303 08:00:08,170 --> 08:00:10,120 But what if I add a\nbit of an optimization 8304 08:00:10,120 --> 08:00:12,370 that you might have glimpsed\non the slide a moment ago 8305 08:00:12,370 --> 08:00:15,970 where if I compare two people and I\n 8306 08:00:15,971 --> 08:00:18,911 don't swap them, and I go all the\nway through the list comparing 8307 08:00:18,911 --> 08:00:22,480 every pair of adjacent\npeople, and I make no swaps 8308 08:00:22,480 --> 08:00:25,181 it would be kind of not\njust naive but stupid 8309 08:00:25,181 --> 08:00:28,721 to do that same process again\n 8310 08:00:28,721 --> 08:00:31,511 I'm not going to make\nany different decisions. 8311 08:00:31,510 --> 08:00:34,100 I'm going to do nothing\nagain, nothing again. 8312 08:00:34,100 --> 08:00:37,510 So at that point, it would be stupid,\n 8313 08:00:38,570 --> 08:00:42,431 So if I modify our pseudo code with\n 8314 08:00:44,411 --> 08:00:50,561 Inside of that same pseudo code, what\n 8315 08:00:50,561 --> 08:00:54,371 Like quit, prematurely before\nthe loops are finished running. 
8316 08:00:54,370 --> 08:00:57,740 One of the loops has gone\nthrough per the indentation here. 8317 08:00:57,741 --> 08:01:00,551 But if I do a loop from\nleft to right and I 8318 08:01:00,550 --> 08:01:03,041 have made no swaps, which you\ncan think of as just being 8319 08:01:03,041 --> 08:01:06,611 one other variable that's plus plusing\n 8320 08:01:07,181 --> 08:01:09,100 if I've made no swaps\nfrom left to right 8321 08:01:09,100 --> 08:01:11,600 I'm not going to make any swaps\nthe next time around either. 8322 08:01:11,600 --> 08:01:14,390 So let's just quit at that point. 8323 08:01:14,390 --> 08:01:16,990 And that is to say in the\nbest case, if you will 8324 08:01:16,991 --> 08:01:20,771 when the list is already sorted,\n 8325 08:01:20,771 --> 08:01:25,820 might indeed be omega of n\nif you add that optimization 8326 08:01:25,820 --> 08:01:28,420 so as to short circuit\nall of that inefficient 8327 08:01:28,420 --> 08:01:34,190 looping to do it only as\nmany times as is necessary. 8328 08:01:34,190 --> 08:01:36,640 Let me pause to see if\nthere's any questions here. 8329 08:01:37,552 --> 08:01:46,399 AUDIENCE: [INAUDIBLE] to optimize the\n 8330 08:01:46,399 --> 08:01:47,441 DAVID J. MALAN: Good question. 8331 08:01:47,440 --> 08:01:53,050 If the running time of selection sort\n 8332 08:01:53,050 --> 08:01:58,841 of n squared but selection sort is in\n 8333 08:01:58,841 --> 08:02:01,661 in omega of n, which sounds better-- 8334 08:02:01,661 --> 08:02:04,870 I think if I may, should we\njust always use bubble sort? 8335 08:02:04,870 --> 08:02:09,040 Yes if we think that we\nmight benefit over time 8336 08:02:09,041 --> 08:02:13,611 from a lot of good case\nscenarios or best case scenarios. 8337 08:02:13,611 --> 08:02:15,701 However, the goal at\nhand in just a bit is 8338 08:02:15,701 --> 08:02:17,841 going to be to do even\nbetter than both of these. 8339 08:02:17,841 --> 08:02:19,690 So hold that question\nfurther for a moment. 
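The optimization described above, quitting as soon as a full pass makes no swaps, might look like this in C. It is a sketch under the same assumptions as before (ascending order, illustrative function name); the returned comparison count is there only so the best case can be observed.

```c
// Bubble sort with the early-exit optimization.
// Returns the number of comparisons performed.
long bubble_sort_early_exit(int numbers[], int n)
{
    long comparisons = 0;
    for (int pass = 0; pass < n - 1; pass++)
    {
        // The "one other variable" that counts swaps during this pass.
        int swaps = 0;
        for (int i = 0; i <= n - 2; i++)
        {
            comparisons++;
            if (numbers[i] > numbers[i + 1])
            {
                int tmp = numbers[i];
                numbers[i] = numbers[i + 1];
                numbers[i + 1] = tmp;
                swaps++;
            }
        }
        // No swaps means every adjacent pair is in order: quit.
        if (swaps == 0)
        {
            break;
        }
    }
    return comparisons;
}
```

On an already sorted list of 8 values the function stops after one pass of 7 comparisons, which is the omega of n behavior. On a reversed list of 8 values every pass swaps something, so it still performs all 49 comparisons.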
8340 08:02:20,190 --> 08:02:25,717 AUDIENCE: [INAUDIBLE] n minus 1? 8341 08:02:27,440 --> 08:02:31,251 So I say omega of n, but is it\ntechnically omega of n minus 1? 8342 08:02:31,251 --> 08:02:34,471 Maybe, but again, we're\nthrowing away lower order terms. 8343 08:02:34,471 --> 08:02:38,031 And that's an advantage because we're\n 8344 08:02:38,030 --> 08:02:41,540 Just like I plotted with the\ngreen and yellow and red chart 8345 08:02:41,541 --> 08:02:44,341 I just want to get a sense of\nthe shape of these algorithms 8346 08:02:44,341 --> 08:02:47,690 so that when n gets really\nlarge, which of these choices 8347 08:02:47,690 --> 08:02:49,545 is going to matter the most? 8348 08:02:49,545 --> 08:02:51,920 At the end of the day, it's\nactually perfectly reasonable 8349 08:02:51,920 --> 08:02:53,820 to use selection sort\nor bubble sort if you 8350 08:02:53,820 --> 08:02:56,570 don't have that much data because\n 8351 08:02:56,570 --> 08:02:58,881 My God, our computers\nnowadays are 1 gigahertz 8352 08:02:58,881 --> 08:03:02,800 2 gigahertz, 1 billion things per\n 8353 08:03:02,800 --> 08:03:05,300 But if we have large data sets,\nas we will later in the term 8354 08:03:05,300 --> 08:03:07,830 and as you might in the real world,\n 8355 08:03:07,830 --> 08:03:09,890 then you're going to want\nto be more thoughtful. 8356 08:03:09,890 --> 08:03:11,600 And that's where we're going today. 8357 08:03:11,600 --> 08:03:14,450 All right, so let's actually see\nthis visualized a little bit. 8358 08:03:14,451 --> 08:03:16,370 In a moment, I'm going\nto change screens here 8359 08:03:16,370 --> 08:03:21,710 to open up what is a little\nvisualization tool that will give us 8360 08:03:21,710 --> 08:03:25,280 a sense of how these things actually\n 8361 08:03:25,280 --> 08:03:27,181 than our humans are able\nto do here on stage. 
8362 08:03:27,181 --> 08:03:31,940 So here is another visualization of a\n 8363 08:03:31,940 --> 08:03:35,030 Short bars mean small numbers,\ntall bars mean big numbers. 8364 08:03:35,030 --> 08:03:37,280 So instead of having the\nnumbers on their torsos here 8365 08:03:37,280 --> 08:03:42,050 we just have bars that are small or tall\n 8366 08:03:42,050 --> 08:03:44,990 Let me go ahead, and I\npreconfigured this in advance 8367 08:03:44,991 --> 08:03:46,411 to operate somewhat quickly. 8368 08:03:46,411 --> 08:03:49,911 Let's go ahead and do selection\nsort by clicking this button. 8369 08:03:49,911 --> 08:03:52,580 And you'll see some pink bars flying by. 8370 08:03:52,580 --> 08:03:56,000 And that's like me walking\nleft and right, left and right 8371 08:03:56,001 --> 08:03:58,501 to select the next smallest number. 8372 08:03:58,501 --> 08:04:01,940 And so what you'll see happening on\n 8373 08:04:01,940 --> 08:04:04,880 is Celeste, if you will, and\nall of the other smaller numbers 8374 08:04:04,881 --> 08:04:08,390 are appearing on the left while\n 8375 08:04:10,170 --> 08:04:13,537 So again, we no longer have to\ntouch the smaller numbers here. 8376 08:04:13,538 --> 08:04:16,370 So that's why the problem is getting\n 8377 08:04:17,151 --> 08:04:20,781 But you can notice now\nvisually, look at how many times 8378 08:04:22,280 --> 08:04:25,070 This is why things\nthat are n squared tend 8379 08:04:25,070 --> 08:04:29,190 to be frowned upon if avoidable because\n 8380 08:04:29,690 --> 08:04:31,970 When I was walking through, I\nkept pointing at the same humans 8381 08:04:34,021 --> 08:04:37,111 So let's see if bubble sort looks\nor feels a little different. 8382 08:04:37,111 --> 08:04:40,548 Let me re-randomize the thing, and let\n 8383 08:04:40,548 --> 08:04:43,341 And as you might infer, there's\n 8384 08:04:43,341 --> 08:04:44,633 not all of which we'll look at. 8385 08:04:46,100 --> 08:04:48,931 Same pink coloration, but it's\ndoing something different.
8386 08:04:48,931 --> 08:04:52,221 It's two pink bars going through\nagain and again comparing 8387 08:04:53,911 --> 08:04:57,411 And you'll see that the largest\n 8388 08:04:57,411 --> 08:05:02,480 to the right, but the smaller\nnumbers, like our number 0 was 8389 08:05:02,480 --> 08:05:04,648 is only slowly making its way over. 8390 08:05:06,561 --> 08:05:09,261 And it's going to take a while\nto get all the way to the left. 8391 08:05:09,260 --> 08:05:12,920 And here too, notice how\nmany times the same bars 8392 08:05:12,920 --> 08:05:16,950 are becoming pink, how many times the\n 8393 08:05:17,960 --> 08:05:21,501 Because it's only solving one\nproblem at a time on each pass. 8394 08:05:21,501 --> 08:05:25,078 And each time we do that, we're stepping\n 8395 08:05:25,078 --> 08:05:28,161 And now granted, I could speed this\n 8396 08:05:28,161 --> 08:05:32,480 but my God, this is only, what, like\n 8397 08:05:33,501 --> 08:05:36,681 Like, this is what n squared\nlooks like and feels like. 8398 08:05:36,681 --> 08:05:38,931 And now I'm just trying to\ncome up with words to say 8399 08:05:38,931 --> 08:05:40,640 until we get to the finish line here. 8400 08:05:40,640 --> 08:05:43,611 Like, this would be annoying if\nthis is the speed of sorting 8401 08:05:43,611 --> 08:05:47,494 and this is why I sort of secretly\n 8402 08:05:47,493 --> 08:05:49,911 because it would have taken\nus an annoying number of steps 8403 08:05:49,911 --> 08:05:51,570 to get that in place for her. 8404 08:05:51,570 --> 08:05:54,530 So those two algorithms are n squared. 8405 08:05:54,530 --> 08:05:56,626 Can we do, in fact, better? 8406 08:05:56,626 --> 08:05:59,751 Well, to save the best algorithm for\n 8407 08:06:00,251 --> 08:06:04,771 And when we come back, we'll\ndo even better than n squared. 
8408 08:06:06,841 --> 08:06:11,541 So the challenge at hand is to\ndo better than selection sort 8409 08:06:11,541 --> 08:06:14,631 and better than bubble sort\nand ideally not just marginally 8410 08:06:14,631 --> 08:06:16,820 better but fundamentally better. 8411 08:06:16,820 --> 08:06:20,330 Just like in week zero, that third\n 8412 08:06:20,330 --> 08:06:23,220 was sort of fundamentally\nfaster than the other two. 8413 08:06:23,221 --> 08:06:26,311 So can we do better than something\non the order of n squared? 8414 08:06:26,311 --> 08:06:28,671 Well, I bet we can if\nwe start to approach 8415 08:06:28,670 --> 08:06:30,440 the problem a little differently. 8416 08:06:30,440 --> 08:06:32,562 The sorts we've done\nthus far, generally known 8417 08:06:32,562 --> 08:06:34,520 as comparison sorts--\nand that kind of captures 8418 08:06:34,521 --> 08:06:38,361 the reality that we were doing a huge\n 8419 08:06:38,361 --> 08:06:41,721 And you kind of saw that in the vertical\n 8420 08:06:41,721 --> 08:06:43,216 was being compared again and again. 8421 08:06:43,216 --> 08:06:45,591 But there's this programming\ntechnique, and it's actually 8422 08:06:45,591 --> 08:06:48,501 a mathematical technique\nknown as recursion 8423 08:06:48,501 --> 08:06:50,311 that we've actually seen before. 8424 08:06:50,311 --> 08:06:53,361 And this is a building\nblock or a mental model 8425 08:06:53,361 --> 08:06:56,661 we can bring to bear on the problem\nto solve the sorting problem 8426 08:06:56,661 --> 08:06:58,100 sort of fundamentally differently. 8427 08:06:58,100 --> 08:07:00,980 But first, let's look at it\nin a more familiar context. 
8428 08:07:00,980 --> 08:07:07,550 A little bit ago, I proposed this pseudo\n 8429 08:07:07,550 --> 08:07:10,490 And notice that what was\ninteresting about this code 8430 08:07:10,491 --> 08:07:14,331 even though I didn't call it out at the\n 8431 08:07:14,330 --> 08:07:17,090 Like, I claim this is\nan algorithm for search 8432 08:07:17,091 --> 08:07:21,111 and yet it seems a little unfair\nthat I'm using the verb search 8433 08:07:21,111 --> 08:07:23,320 inside of the algorithm for search. 8434 08:07:23,320 --> 08:07:26,080 It's like an English sort of\ndefining a word by using the word. 8435 08:07:26,080 --> 08:07:28,210 Normally, you shouldn't\nreally get away with that. 8436 08:07:28,210 --> 08:07:30,791 But there's something\ninteresting about this technique 8437 08:07:30,791 --> 08:07:35,471 here because even though this\nwhole thing is a search algorithm 8438 08:07:35,471 --> 08:07:40,871 and I'm using my own algorithm to\n 8439 08:07:40,870 --> 08:07:42,880 the key feature here\nthat doesn't normally 8440 08:07:42,881 --> 08:07:46,031 happen in English when you\ndefine a word in terms of a word 8441 08:07:46,030 --> 08:07:49,661 is that when I search the left\n 8442 08:07:51,170 --> 08:07:52,450 I'm using the same algorithm. 8443 08:07:52,451 --> 08:07:55,421 But the problem is, by\ndefinition, half as large. 8444 08:07:55,420 --> 08:07:58,540 So this isn't going to be a\ncyclical argument in the same way. 8445 08:07:58,541 --> 08:08:02,111 This approach, by using\nsearch within search 8446 08:08:02,111 --> 08:08:05,651 is going to whittle the problem down\n 8447 08:08:05,651 --> 08:08:08,541 one door or no doors remains. 8448 08:08:08,541 --> 08:08:11,081 And so recursion is a\nprogramming technique 8449 08:08:11,080 --> 08:08:14,290 whereby a function calls itself. 8450 08:08:14,291 --> 08:08:18,581 And we haven't seen this yet in C, and\n 8451 08:08:18,580 --> 08:08:22,482 But in C, you can have\na function call itself. 
8452 08:08:22,483 --> 08:08:24,190 And the form that\ntakes is like literally 8453 08:08:24,190 --> 08:08:28,540 using the function's name inside of\n 8454 08:08:28,541 --> 08:08:32,671 We've actually seen an opportunity\nfor this once before too. 8455 08:08:33,670 --> 08:08:35,920 Here's that same pseudo code\nfor searching for someone 8456 08:08:35,920 --> 08:08:37,540 in an actual, physical phone book. 8457 08:08:37,541 --> 08:08:40,061 And notice these yellow lines here. 8458 08:08:40,061 --> 08:08:44,171 We described those in week zero\nas inducing a loop, a cycle. 8459 08:08:44,170 --> 08:08:48,670 And this is a very procedural approach,\n 8460 08:08:48,670 --> 08:08:50,830 are very mechanically,\nif you will, telling 8461 08:08:50,830 --> 08:08:54,730 me to go back to line three to\ndo this kind of looping thing. 8462 08:08:54,730 --> 08:08:59,111 But really, what that's doing in the\n 8463 08:08:59,111 --> 08:09:04,480 book is it's just telling me to search\n 8464 08:09:04,480 --> 08:09:08,260 I'm doing it more mechanically\nagain by sort of telling myself 8465 08:09:08,260 --> 08:09:09,850 what line number to go back to. 8466 08:09:09,850 --> 08:09:12,760 But that's equivalent to just telling\n 8467 08:09:12,760 --> 08:09:15,310 search the right half, the\nkey thing being the left 8468 08:09:15,311 --> 08:09:18,081 half and the right half are\nsmaller than the original problem. 8469 08:09:18,080 --> 08:09:21,990 It would be a bug if I just said search\n 8470 08:09:21,991 --> 08:09:23,741 because obviously, you\nnever get anywhere. 8471 08:09:23,741 --> 08:09:25,901 But if you search the\nhalf, the half, the half 8472 08:09:25,901 --> 08:09:27,831 problem gets smaller and smaller.
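The idea of a search that calls itself on a half-sized problem can be sketched in C as a recursive binary search. The lecture's version works on doors and phone book pages; here a sorted array of ints stands in for them, and the parameter names `low`, `high`, and `target` are assumptions made for this example.

```c
#include <stdbool.h>

// Recursive binary search over numbers[low..high], which must be
// sorted in ascending order. Returns true if target is present.
bool search(const int numbers[], int low, int high, int target)
{
    // Base case: no elements remain, so the target is not here.
    if (low > high)
    {
        return false;
    }

    int middle = low + (high - low) / 2;
    if (numbers[middle] == target)
    {
        return true;
    }
    else if (target < numbers[middle])
    {
        // Same algorithm, but on the left half only.
        return search(numbers, low, middle - 1, target);
    }
    else
    {
        // Same algorithm, but on the right half only.
        return search(numbers, middle + 1, high, target);
    }
}
```

Each recursive call excludes the middle element and one whole half, so the range strictly shrinks and the base case is eventually reached. Calling `search` again on the same `low` and `high` would be the never-ending bug described above.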
8473 08:09:27,830 --> 08:09:34,540 So let's reformulate week zero's phone\n 8474 08:09:34,541 --> 08:09:39,501 but recursive whereby in\nthis search algorithm 8475 08:09:39,501 --> 08:09:42,940 AKA binary search, formerly\ncalled divide and conquer, I'm 8476 08:09:42,940 --> 08:09:46,300 going to literally use also\nthe keyword search here. 8477 08:09:46,300 --> 08:09:48,310 Notice among the benefits\nof doing this is it 8478 08:09:48,311 --> 08:09:51,159 kind of tightens the code up,\nmakes it a little more succinct 8479 08:09:51,158 --> 08:09:53,201 even though that's kind\nof a fringe benefit here. 8480 08:09:53,201 --> 08:09:56,710 But it's an elegant\nway too of describing 8481 08:09:56,710 --> 08:10:01,570 a problem by just having\na function use itself 8482 08:10:01,570 --> 08:10:05,690 to solve a smaller puzzle at hand. 8483 08:10:05,690 --> 08:10:08,740 So let's now consider a\nfamiliar problem, a smaller 8484 08:10:08,741 --> 08:10:11,493 version than the one you've dabbled\nwith-- this sort of pyramid 8485 08:10:11,492 --> 08:10:12,700 this half pyramid from Mario. 8486 08:10:12,701 --> 08:10:15,431 And let's throw away the parts\nthat aren't that interesting 8487 08:10:15,431 --> 08:10:19,780 and just consider how we might, up\n 8488 08:10:19,780 --> 08:10:21,880 this left aligned pyramid, if you will. 8489 08:10:21,881 --> 08:10:28,991 Let me go over here, and let me create\n 8490 08:10:28,991 --> 08:10:32,441 And in this file, I'm going to\ngo ahead and include cs50.h. 8491 08:10:32,440 --> 08:10:36,251 And I'm going to include stdio.h. 8492 08:10:36,251 --> 08:10:41,800 And the goal at hand is to implement in\n 8493 08:10:41,800 --> 08:10:43,640 this and exactly this pyramid. 8494 08:10:43,640 --> 08:10:46,473 So no get string or any of that--\n 8495 08:10:46,473 --> 08:10:49,760 and print exactly this\npyramid of height 4 here. 
8496 08:10:50,931 --> 08:10:55,306 Well, let me go ahead, and in main,\n 8497 08:10:55,306 --> 08:10:56,931 well, we'll go ahead and generalize it. 8498 08:10:56,931 --> 08:10:58,763 Let's go ahead and ask\nthe user for heights. 8499 08:10:58,763 --> 08:11:00,611 We're using get_int as before. 8500 08:11:00,611 --> 08:11:02,861 And I'll store that in a\nvariable called height. 8501 08:11:02,861 --> 08:11:05,111 And then let me go ahead\nand simply call the function 8502 08:11:05,111 --> 08:11:06,771 draw passing in that height. 8503 08:11:06,771 --> 08:11:09,401 So for the moment, let me\nassume that someone somewhere 8504 08:11:09,401 --> 08:11:10,991 has implemented a draw function. 8505 08:11:10,991 --> 08:11:14,171 And this, then, is the\nentirety of my program. 8506 08:11:14,170 --> 08:11:17,330 All right, unfortunately, C does\nnot come with a draw function. 8507 08:11:17,330 --> 08:11:19,180 So let me go ahead and invent one. 8508 08:11:19,181 --> 08:11:20,661 It doesn't need to return a value. 8509 08:11:20,661 --> 08:11:23,210 It just needs to print\nsomething-- so-called side effect. 8510 08:11:23,210 --> 08:11:27,820 So I'm going to define a function\n 8511 08:11:27,820 --> 08:11:30,640 I'll call it n for number, but\nI could call it anything I want. 8512 08:11:32,210 --> 08:11:37,600 I'm going to go ahead and print out a\n 8513 08:11:38,350 --> 08:11:42,070 The salient features here are that this\n 8514 08:11:43,181 --> 08:11:46,600 And now in height four, the\nfirst row has one brick. 8515 08:11:49,631 --> 08:11:52,461 That's a nice pattern that I\ncan probably represent in code. 8516 08:11:53,570 --> 08:11:55,751 Well, how about for int i gets-- 8517 08:11:55,751 --> 08:11:57,311 let me do it the old school way-- 8518 08:11:58,091 --> 08:12:02,890 And then i is less than or equal to n. 8519 08:12:04,541 --> 08:12:08,171 so I'm going from 1 to 4 just\nto keep myself sane here.
8520 08:12:08,170 --> 08:12:11,050 And then inside of this\nloop, what do I want to do? 8521 08:12:11,050 --> 08:12:12,920 Well, let me keep it\nconventional, in fact. 8522 08:12:12,920 --> 08:12:16,330 Let me just change this to\nbe the more conventional 0 8523 08:12:16,330 --> 08:12:20,530 to n even though it might not be\n 8524 08:12:21,911 --> 08:12:24,326 On row 1, I want two\nbricks, dot dot dot. 8525 08:12:26,890 --> 08:12:28,600 But I'm being more conventional. 8526 08:12:28,600 --> 08:12:32,800 So on each row, how many\nbricks do I want to print? 8527 08:12:32,800 --> 08:12:34,240 Well, I think I want to do this. 8528 08:12:34,241 --> 08:12:40,061 For int j, for instance, common to\n 8529 08:12:40,061 --> 08:12:47,291 let's start j at 0 and do this\nso long as j is less than i plus 1 8530 08:12:50,861 --> 08:12:55,030 Well, again, when i equals 0, that's\n 8531 08:12:55,030 --> 08:12:57,010 When i equals 1, that's the second row. 8532 08:12:57,791 --> 08:13:00,611 And dot dot dot, when i\nis 3, I want four bricks. 8533 08:13:00,611 --> 08:13:03,820 So again, I have to add 1 to i\nto get the total number of bricks 8534 08:13:03,820 --> 08:13:05,480 that I want to print to the screen. 8535 08:13:05,480 --> 08:13:10,181 So inside of this nested for loop,\n 8536 08:13:13,330 --> 08:13:17,472 I'm going to save the new\nline for about here instead. 8537 08:13:17,473 --> 08:13:19,181 All right, the last\nthing I'm going to do 8538 08:13:19,181 --> 08:13:22,911 is copy and paste the prototype\nat the top of the file. 8539 08:13:23,980 --> 08:13:27,088 And again, this is of\nnow week one, week two. 8540 08:13:27,088 --> 08:13:29,381 Wouldn't necessarily come to\nyour mind as quickly as it 8541 08:13:29,381 --> 08:13:32,741 might to mine after all this practice,\n 8542 08:13:32,741 --> 08:13:35,591 of what you yourself did\nalready for Mario-- printing out 8543 08:13:35,591 --> 08:13:38,690 a pyramid that hopefully in a\nmoment is going to look like this.
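The nested loops just dictated can be written out in C roughly as below. The lecture's `draw` takes only the height and prints to the screen; the extra helper `draw_to`, which takes a `FILE *`, is an addition made here so the output can be captured and checked, and it is not part of the lecture's code. The `main` function that calls `get_int` is omitted so the sketch does not depend on the CS50 library.

```c
#include <stdio.h>

// Draw a left-aligned pyramid of height n to the given stream
// using iteration: row i (counting from 0) gets i + 1 bricks.
void draw_to(FILE *out, int n)
{
    for (int i = 0; i < n; i++)
    {
        for (int j = 0; j < i + 1; j++)
        {
            fprintf(out, "#");
        }
        // The newline is printed once per row, after the inner loop.
        fprintf(out, "\n");
    }
}

// Same signature as the lecture's draw: print to the screen.
void draw(int n)
{
    draw_to(stdout, n);
}
```

Calling `draw(4)` prints four rows of 1, 2, 3, and 4 hashes.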
8544 08:13:38,690 --> 08:13:40,730 So let me go back to my code. 8545 08:13:40,730 --> 08:13:44,920 Let me run make iteration, and\nlet me do dot slash iteration. 8546 08:13:46,901 --> 08:13:50,201 Seems to be correct, and let's assume\n 8547 08:13:56,131 --> 08:13:59,820 So this is indeed an example\nof iteration-- doing something 8548 08:14:02,521 --> 08:14:05,521 Like, I literally have a function\n 8549 08:14:05,521 --> 08:14:09,931 But I can think about implementing\n 8550 08:14:11,070 --> 08:14:13,080 And it's not strictly\nnecessary for this problem 8551 08:14:13,080 --> 08:14:15,720 because this problem honestly\nis not that complicated 8552 08:14:15,721 --> 08:14:17,747 to solve once you have\npractice under your belt. 8553 08:14:17,747 --> 08:14:20,580 Certainly the first time around,\n 8554 08:14:20,580 --> 08:14:23,610 But now that you kind of\nassociate, OK, row one 8555 08:14:23,611 --> 08:14:26,370 with one brick, row two with two\n 8556 08:14:27,701 --> 08:14:30,131 But how else could we\nthink about this problem? 8557 08:14:30,131 --> 08:14:33,300 Well, this physical structure,\nthese bricks, in some sense 8558 08:14:33,300 --> 08:14:39,341 is a recursive structure, a structure\n 8559 08:14:40,721 --> 08:14:45,961 Well, if I were to ask you the question,\n 8560 08:14:45,960 --> 08:14:49,080 look like, you would point,\nof course, to this picture. 8561 08:14:49,080 --> 08:14:55,530 But you could also kind of\ncleverly say to me, well 8562 08:14:55,530 --> 08:15:00,600 it's actually a pyramid of\nheight 3 plus 1 additional row. 8563 08:15:00,600 --> 08:15:02,597 And here's that cyclical\nargument, right? 8564 08:15:02,598 --> 08:15:05,431 Kind of obnoxious to do typically\n 8565 08:15:05,431 --> 08:15:07,561 because you're defining one\nthing in terms of itself. 8566 08:15:07,561 --> 08:15:08,769 What's a pyramid of height 4? 8567 08:15:08,769 --> 08:15:12,541 Well, it's a pyramid of\nheight 3 plus 1 more row. 
8568 08:15:12,541 --> 08:15:15,300 But we can kind of leverage\nthis logic in code. 8569 08:15:15,300 --> 08:15:16,951 Well, what's a pyramid of height 3? 8570 08:15:16,951 --> 08:15:19,230 Well, it's a pyramid of\nheight 2 plus 1 more row. 8571 08:15:19,230 --> 08:15:21,300 Fine, what's a pyramid of height 2? 8572 08:15:21,300 --> 08:15:23,730 Well, it's a pyramid of\nheight 1 plus 1 more row. 8573 08:15:23,730 --> 08:15:26,730 And then hopefully, this process\n 8574 08:15:26,730 --> 08:15:29,350 the pyramid is getting\nsmaller and smaller. 8575 08:15:29,350 --> 08:15:32,970 So you're not going to have this\n 8576 08:15:32,971 --> 08:15:36,911 infinitely many times because when\n 8577 08:15:36,911 --> 08:15:38,491 the end of the pyramid, fine. 8578 08:15:38,491 --> 08:15:39,961 What is a pyramid of height 1? 8579 08:15:39,960 --> 08:15:42,990 Well, it's a pyramid of no\nheight plus one more row. 8580 08:15:42,991 --> 08:15:45,631 And at that point, things\njust get negative-- 8581 08:15:46,411 --> 08:15:48,396 Things just would otherwise go negative. 8582 08:15:48,396 --> 08:15:49,771 And so you can just kind of stop. 8583 08:15:49,771 --> 08:15:52,291 The base case is when\nthere is no more pyramid. 8584 08:15:52,291 --> 08:15:56,441 So there's a way to draw a line in the\n 8585 08:15:56,440 --> 08:16:00,540 But this idea of defining a physical\n 8586 08:16:00,541 --> 08:16:06,541 or code in terms of itself actually lets\n 8587 08:16:06,541 --> 08:16:08,521 Let me go back to my code here. 8588 08:16:08,521 --> 08:16:14,161 Let me go ahead and create one\n 8589 08:16:14,161 --> 08:16:20,460 that leverages this idea of this\n 8590 08:16:22,440 --> 08:16:26,100 Let me go ahead and include\nstandardio.h, int main void. 8591 08:16:26,100 --> 08:16:30,360 And then inside of main, I'm going\n 8592 08:16:30,361 --> 08:16:34,831 height equals get int,\nasking the user for height. 
8593 08:16:34,830 --> 08:16:38,230 And then I'm going to go ahead\nand call draw passing in height. 8594 08:16:38,230 --> 08:16:39,881 So that's going to stay the same. 8595 08:16:39,881 --> 08:16:45,941 I even am going to make my prototype\n 8596 08:16:45,940 --> 08:16:48,181 And now I'm going to\nimplement void down here 8597 08:16:48,181 --> 08:16:49,951 with that same prototype, of course. 8598 08:16:49,951 --> 08:16:52,831 But the code now is going\nto be a little different. 8599 08:16:54,431 --> 08:16:59,971 Well, first of all, if you ask\nme to draw a pyramid of height n 8600 08:16:59,971 --> 08:17:02,941 I'm going to be kind of a wise\nass here and say, well, just 8601 08:17:02,940 --> 08:17:05,370 draw a pyramid of n minus 1-- 8602 08:17:06,122 --> 08:17:08,580 All right, but there's still\na little more work to be done. 8603 08:17:08,580 --> 08:17:13,020 What happens after I print or\ndraw a pyramid of height n minus 1 8604 08:17:13,021 --> 08:17:17,701 according to our structural\ndefinition a moment ago? 8605 08:17:17,701 --> 08:17:22,831 What remains after drawing a pyramid\n 8606 08:17:25,201 --> 08:17:26,761 We need one more row of hashes. 8607 08:17:26,760 --> 08:17:28,110 OK, so I can do that, right? 8608 08:17:28,111 --> 08:17:29,574 I'm OK with the single loops. 8609 08:17:29,574 --> 08:17:30,991 There's no nesting necessary here. 8610 08:17:30,991 --> 08:17:35,371 I'm just going to do this-- for\nint i get 0, i is less than n 8611 08:17:35,370 --> 08:17:37,545 which is the height that's\npassed in, i plus plus. 8612 08:17:37,545 --> 08:17:39,420 And then inside of this\nloop, I'm very simply 8613 08:17:39,420 --> 08:17:41,100 going to print out a single hash. 8614 08:17:41,100 --> 08:17:45,010 And then down here, I'm going to\n 8615 08:17:45,960 --> 08:17:48,080 I might not be as comfortable\nwith nested loops. 8616 08:17:49,080 --> 08:17:52,020 What does this loop do\nhere on line 17 through 20? 
8617 08:17:52,021 --> 08:17:57,901 It literally prints n hashes by\n 8618 08:17:59,491 --> 08:18:02,291 So that's sort of week one style syntax. 8619 08:18:02,291 --> 08:18:05,101 But this is kind of trippy\nnow because I've somehow 8620 08:18:05,100 --> 08:18:09,841 boiled down the implementation of\n 8621 08:18:11,611 --> 08:18:15,611 But this is problematic as\nis because in this case 8622 08:18:15,611 --> 08:18:22,261 my draw function, notice, is always\n 8623 08:18:23,251 --> 08:18:28,561 But ideally, when do I want\nthis cyclical process to stop? 8624 08:18:28,561 --> 08:18:32,371 When do I want to not call draw anymore? 8625 08:18:35,657 --> 08:18:37,740 When I get to the top of\nthe pyramid, when n is 1 8626 08:18:37,741 --> 08:18:40,081 or heck, when the pyramid's\nall gone and n equals 0. 8627 08:18:40,080 --> 08:18:42,420 I can pick any line in\nthe sand, so long as it's 8628 08:18:42,420 --> 08:18:44,050 sort of at the end of the process. 8629 08:18:44,050 --> 08:18:45,841 Then I don't want to call draw anymore. 8630 08:18:45,841 --> 08:18:48,210 So maybe what I should do is this. 8631 08:18:48,210 --> 08:18:54,810 If n equals equals 0, there's\nreally nothing to draw. 8632 08:18:54,811 --> 08:18:58,701 So I'm just going to go\nahead and return like this. 8633 08:18:58,701 --> 08:19:01,281 Otherwise, I'm going\nto go ahead and draw 8634 08:19:01,280 --> 08:19:04,598 n minus 1 rows and then one more row. 8635 08:19:04,598 --> 08:19:06,140 And I could express this differently. 8636 08:19:06,140 --> 08:19:08,900 I could do something like this,\nwhich would be equivalent. 8637 08:19:08,901 --> 08:19:13,851 I could say something like if n\nis greater than or equal to 0 8638 08:19:13,850 --> 08:19:15,823 then go ahead and draw the row. 8639 08:19:15,823 --> 08:19:17,030 But I like it this way first. 
8640 08:19:17,030 --> 08:19:18,948 For now, I'm going to\ngo with the original way 8641 08:19:18,948 --> 08:19:22,100 just to ask a simple question and\n 8642 08:19:23,330 --> 08:19:26,100 And heck, just to be\nsuper safe, just in case 8643 08:19:26,100 --> 08:19:28,400 the user types in a\nnegative number, let me also 8644 08:19:28,401 --> 08:19:31,341 just check if n is a negative number,\n 8645 08:19:32,570 --> 08:19:35,850 I'm not returning a value because\nagain, the function is void. 8646 08:19:35,850 --> 08:19:38,040 It doesn't need or have a return value. 8647 08:19:38,041 --> 08:19:40,221 So just saying return suffices. 8648 08:19:40,221 --> 08:19:45,021 But if n equals 1 or 2\nor 3 or anything higher 8649 08:19:45,021 --> 08:19:50,480 it is reasonable to draw a pyramid of\n 8650 08:19:50,480 --> 08:19:55,041 of 4, 3, and then go ahead\nand print one more row. 8651 08:19:55,041 --> 08:20:00,811 So this is an example now of code\n 8652 08:20:02,510 --> 08:20:07,280 But this so-called base case\nensures, this conditional ensures 8653 08:20:07,280 --> 08:20:09,050 that we're not going to do this forever. 8654 08:20:09,050 --> 08:20:11,661 Otherwise, we literally would\ndo this infinitely many times 8655 08:20:11,661 --> 08:20:14,690 and something bad is\nprobably going to happen. 8656 08:20:14,690 --> 08:20:18,591 All right, let me go ahead and\n 8657 08:20:18,591 --> 08:20:22,611 OK, no syntax errors-- dot slash\nrecursion, Enter, height of 4 8658 08:20:24,951 --> 08:20:28,761 If only because some of you have run\n 8659 08:20:28,760 --> 08:20:32,780 let me get rid of the base case\n 8660 08:20:34,341 --> 08:20:36,510 Oh, and actually, now\nit's actually catching it. 8661 08:20:36,510 --> 08:20:39,411 So the compiler is smart\nenough here to realize 8662 08:20:39,411 --> 08:20:42,830 that all paths through this\nfunction will call itself. 8663 08:20:42,830 --> 08:20:45,661 AKA, It's going to loop forever. 
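The recursive version with its base case might look roughly like this. It is a sketch under the same assumption as before: the lecture's function is void, and the int return value (total bricks printed) is added here only to make the behavior checkable. The single test n <= 0 folds together the two checks described above, n equal to 0 and n negative.

```c
#include <stdio.h>

// Recursive pyramid: a pyramid of height n is a pyramid of height n - 1
// plus one more row of n bricks.
// Assumption: returns the total bricks printed; the lecture's draw is void.
int draw(int n)
{
    // Base case: zero or negative height means there is nothing to draw,
    // so the function stops calling itself.
    if (n <= 0)
    {
        return 0;
    }

    // Recursive case: first draw the smaller pyramid...
    int bricks = draw(n - 1);

    // ...then print the one remaining row of n hashes.
    for (int i = 0; i < n; i++)
    {
        printf("#");
    }
    printf("\n");

    return bricks + n;
}
```

With the base case in place, draw(-100) returns immediately instead of calling itself over and over until the program crashes.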
8664 08:20:45,661 --> 08:20:47,451 So let me do the first thing. 8665 08:20:47,451 --> 08:20:49,911 Suppose I only check for n equaling 0. 8666 08:20:49,911 --> 08:20:53,751 Let me go ahead and recompile\nthis code with make recursion. 8667 08:20:53,751 --> 08:20:56,390 And now let me just be\nkind of uncooperative. 8668 08:20:56,390 --> 08:21:00,560 When I run this program, still\nworks for 4, still works for 0. 8669 08:21:00,561 --> 08:21:03,471 What if I do like negative 100? 8670 08:21:03,471 --> 08:21:07,281 Have any of you experienced a\nsegmentation fault or core dump? 8671 08:21:08,600 --> 08:21:13,341 Like, this means I have somehow\n 8672 08:21:13,341 --> 08:21:17,271 And in short, I actually called\nthis function thousands of times 8673 08:21:17,271 --> 08:21:19,911 accidentally, it would seem\nnow, until the program just 8674 08:21:19,911 --> 08:21:22,774 bailed on me because I eventually\ntouched memory in the computer 8675 08:21:23,690 --> 08:21:25,501 That'll make even more sense next week. 8676 08:21:25,501 --> 08:21:27,001 But for now, it's simply a bug. 8677 08:21:27,001 --> 08:21:28,791 And I can avoid that\nbug in this context 8678 08:21:28,791 --> 08:21:33,261 probably not your own pset context,\n 8679 08:21:33,260 --> 08:21:35,700 allow for negative numbers at all. 8680 08:21:35,701 --> 08:21:38,061 So with this building\nblock in place, what 8681 08:21:38,061 --> 08:21:41,781 can we now do in terms of\nthose same numbers to sort? 8682 08:21:41,780 --> 08:21:44,751 Well, it turns out there's a\n 8683 08:21:44,751 --> 08:21:46,341 And there's bunches of others too. 8684 08:21:46,341 --> 08:21:51,283 But merge sort is a nice one to discuss\n 8685 08:21:51,283 --> 08:21:53,451 is going to do better than\nselection sort and bubble 8686 08:21:53,451 --> 08:21:55,851 sort that is better than n squared. 8687 08:21:55,850 --> 08:21:58,598 But the catch is it's a\nlittle harder to think about. 
8688 08:21:58,598 --> 08:22:01,640 In fact, I'll act it out myself with\n 8689 08:22:01,640 --> 08:22:05,721 rather than humans because recursion\n 8690 08:22:05,721 --> 08:22:08,061 to wrap your mind around,\ntypically a bit of practice. 8691 08:22:08,061 --> 08:22:10,269 But I'll see if we can't\nwalk through it methodically 8692 08:22:10,269 --> 08:22:12,620 enough such that this comes to light. 8693 08:22:12,620 --> 08:22:16,820 So here's the pseudo code I propose\n 8694 08:22:16,820 --> 08:22:20,001 In the spirit of recursion,\nthis sorting algorithm 8695 08:22:20,001 --> 08:22:25,381 literally calls itself by using\n 8696 08:22:25,381 --> 08:22:26,961 So how does merge sort work? 8697 08:22:26,960 --> 08:22:30,380 It sort of obnoxiously says, well, if\n 8698 08:22:30,381 --> 08:22:33,081 go sort the left half, then\ngo sort the right half 8699 08:22:33,080 --> 08:22:34,790 and then merge the two together. 8700 08:22:34,791 --> 08:22:36,043 Now obnoxious in what sense? 8701 08:22:36,043 --> 08:22:38,751 Well, if I just asked you to sort\n 8702 08:22:38,751 --> 08:22:40,431 well, go sort that\nthing and then go sort 8703 08:22:40,431 --> 08:22:43,098 that thing, what was the point\nof asking you in the first place? 8704 08:22:43,098 --> 08:22:45,441 But the key is that\neach of these lines is 8705 08:22:45,440 --> 08:22:48,240 sorting a smaller piece of the problem. 8706 08:22:48,241 --> 08:22:50,811 So eventually, we'll be\nable to pare this down 8707 08:22:50,811 --> 08:22:54,771 into something that doesn't go on\n 8708 08:22:56,480 --> 08:22:59,300 There's a scenario where we\njust check, wait a minute 8709 08:22:59,300 --> 08:23:01,881 if there's only one\nnumber to sort, that's it. 8710 08:23:01,881 --> 08:23:03,831 Quit then because you're all done. 8711 08:23:03,830 --> 08:23:06,950 So there has to be this base\ncase in any use of recursion 8712 08:23:06,951 --> 08:23:11,451 to make sure that you don't\nmindlessly call yourself forever. 
8713 08:23:11,451 --> 08:23:13,741 You've got to stop at some point. 8714 08:23:13,741 --> 08:23:16,921 So let's focus on the\nthird of these steps. 8715 08:23:16,920 --> 08:23:21,688 What does it mean to merge two\nlists, two halves of a list 8716 08:23:21,688 --> 08:23:23,480 just because this is\napparently going to be 8717 08:23:23,480 --> 08:23:25,650 a key ingredient-- so\nhere, for instance 8718 08:23:25,651 --> 08:23:28,251 are two halves of a list of size 8. 8719 08:23:28,251 --> 08:23:31,370 We have the numbers 2-- and I'll call\n 8720 08:23:36,561 --> 08:23:41,070 Notice that the left half at the\n 8721 08:23:41,070 --> 08:23:45,361 and the right half, 0136,\nis also sorted as well. 8722 08:23:45,361 --> 08:23:48,230 So that's a good thing because\nit means that theoretically, I've 8723 08:23:48,230 --> 08:23:49,640 sorted the left half already. 8724 08:23:49,640 --> 08:23:51,980 I've sorted the right half\nalready before we began. 8725 08:23:51,980 --> 08:23:54,050 I just need to merge these two halves. 8726 08:23:54,050 --> 08:23:56,161 What does it mean to sort two halves? 8727 08:23:56,161 --> 08:23:57,911 Well, for the sake of\ndiscussion, I'm just 8728 08:23:57,911 --> 08:24:03,701 going to turn over most of the numbers\n 8729 08:24:04,721 --> 08:24:07,751 There's two halves here, left and right. 8730 08:24:07,751 --> 08:24:09,820 At the moment, I'm\nonly going to consider 8731 08:24:09,820 --> 08:24:13,893 the leftmost element of each half--\n 8732 08:24:13,893 --> 08:24:15,100 and the one on the left here. 8733 08:24:15,100 --> 08:24:18,161 How do I merge these two lists together? 8734 08:24:18,161 --> 08:24:22,901 Well, if I look at 2 and I look at 0,\n 8735 08:24:23,863 --> 08:24:25,570 So I'm going to grab\nthe 0, and I'm going 8736 08:24:25,570 --> 08:24:28,510 to put it into its own place\non this new shelf here. 
8737 08:24:28,510 --> 08:24:34,661 And now I'm going to consider,\nas part of my iteration 8738 08:24:34,661 --> 08:24:37,631 the beginning of this list and\nthe new beginning of this list. 8739 08:24:37,631 --> 08:24:39,491 So I'm now comparing 2 and 1. 8740 08:24:40,570 --> 08:24:42,760 I'm going to go ahead and grab the 1. 8741 08:24:42,760 --> 08:24:45,490 Now I'm going to compare the\nbeginning of the left list 8742 08:24:45,491 --> 08:24:47,801 and the new beginning of\nthe right list, 2 and 3. 8743 08:24:49,379 --> 08:24:51,671 Now I'm going to compare the\nbeginning of the left list 8744 08:24:51,670 --> 08:24:53,650 and the beginning of\nthe right list, 4 and 3. 8745 08:24:55,870 --> 08:24:58,611 Now I'm going to compare the 4\nagainst the beginning and end 8746 08:24:58,611 --> 08:25:00,191 it turns out, of the second list-- 8747 08:25:01,348 --> 08:25:03,640 Now I'm going to compare the\nbeginning of the left list 8748 08:25:03,640 --> 08:25:05,183 and the beginning of the right list-- 8749 08:25:06,791 --> 08:25:09,371 I'm realizing this is not going\nto end well because I left 8750 08:25:09,370 --> 08:25:10,780 too much distance between the numbers. 8751 08:25:10,780 --> 08:25:12,698 But that has nothing to\ndo with the algorithm. 8752 08:25:12,698 --> 08:25:14,201 7 is the beginning of the left list. 8753 08:25:14,201 --> 08:25:15,820 6 is the beginning of the right list. 8754 08:25:17,320 --> 08:25:20,771 And at the risk of\nknocking all of these over 8755 08:25:20,771 --> 08:25:27,671 if I now make room for this\nelement, we have hopefully 8756 08:25:27,670 --> 08:25:34,760 sorted the whole thing by having merged\n 8757 08:25:40,809 --> 08:25:43,101 I'm a little worried that's\njust getting sarcastic now 8758 08:25:43,100 --> 08:25:48,230 but we now have merged two half lists. 
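The merging step acted out with the shelves can be sketched as a C function. The names merge, left, right and out are illustrative choices, not taken from the lecture, and the caller is assumed to supply an output array with enough room for both halves.

```c
#include <stddef.h>

// Merge two already-sorted arrays into out.
// Assumption: out has room for left_len + right_len ints.
void merge(const int left[], size_t left_len,
           const int right[], size_t right_len, int out[])
{
    size_t i = 0; // front of the left list
    size_t j = 0; // front of the right list
    size_t k = 0; // next free slot on the "new shelf"

    // Compare the fronts of both lists and take the smaller one each time.
    while (i < left_len && j < right_len)
    {
        if (left[i] <= right[j])
        {
            out[k++] = left[i++];
        }
        else
        {
            out[k++] = right[j++];
        }
    }

    // One list is exhausted; copy whatever remains of the other.
    while (i < left_len)
    {
        out[k++] = left[i++];
    }
    while (j < right_len)
    {
        out[k++] = right[j++];
    }
}
```

Each element is copied exactly once, so merging two halves that hold n numbers in total takes on the order of n steps.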
8759 08:25:48,230 --> 08:25:51,791 We haven't done the guts of the\n 8760 08:25:52,791 --> 08:25:55,521 But I claim that that\nis how mechanically you 8761 08:25:57,320 --> 08:25:59,420 You keep looking at the\nbeginning of each list 8762 08:25:59,420 --> 08:26:02,001 and you just kind of\nweave them together based 8763 08:26:02,001 --> 08:26:05,431 on which one belongs\nfirst based on its size. 8764 08:26:05,431 --> 08:26:07,820 So if you agree that\nthat was a reasonable way 8765 08:26:07,820 --> 08:26:12,381 to merge two lists together,\nlet's go ahead and focus lastly 8766 08:26:12,381 --> 08:26:15,621 on what it means to\nactually sort the left half 8767 08:26:15,620 --> 08:26:17,797 and sort the right half of\na whole bunch of numbers. 8768 08:26:17,797 --> 08:26:19,880 And for this, I'm going\nto go ahead and order them 8769 08:26:19,881 --> 08:26:21,681 in this seemingly random order. 8770 08:26:21,681 --> 08:26:24,561 And I just have a little cheat\n 8771 08:26:24,561 --> 08:26:26,691 And I'm going to start at\nthe very top this time. 8772 08:26:26,690 --> 08:26:29,751 And hopefully, these will\nnot fall down at any point. 8773 08:26:29,751 --> 08:26:35,661 But I'm just deliberately putting\n 8774 08:26:41,361 --> 08:26:43,701 Hopefully this won't fall over. 8775 08:26:43,701 --> 08:26:48,351 Here is now an array of\nsize 8 with eight integers. 8776 08:26:49,651 --> 08:26:52,693 I could use selection sort and just\n 8777 08:26:52,692 --> 08:26:55,460 I could use bubble sort and just\ncompare pairs, pairs, pairs. 8778 08:26:55,460 --> 08:26:58,190 But those are going to be on\nthe order of big O of n squared. 8779 08:26:58,190 --> 08:27:00,360 My hope is to do\nfundamentally better here. 8780 08:27:00,361 --> 08:27:02,120 So let's see if we can do better. 8781 08:27:02,120 --> 08:27:04,161 All right, so let me\nlook now at my code. 8782 08:27:05,850 --> 08:27:07,670 How do I implement merge sort? 
8783 08:27:07,670 --> 08:27:09,663 Well, if there's only\none number, I quit. 8784 08:27:10,580 --> 08:27:12,630 There's eight numbers,\nso that's not applicable. 8785 08:27:12,631 --> 08:27:14,963 I'm going to go ahead and\nsort the left half of numbers. 8786 08:27:14,963 --> 08:27:16,820 All right, here's the left half-- 8787 08:27:18,830 --> 08:27:21,590 Do I sort an array of size 4? 8788 08:27:21,591 --> 08:27:24,530 Well, here's where the\nrecursion kicks in. 8789 08:27:24,530 --> 08:27:26,510 How do you sort a list of size 4? 8790 08:27:26,510 --> 08:27:28,670 Well, there's the pseudo\ncode on the board. 8791 08:27:28,670 --> 08:27:31,850 I sort the left half\nof the list of size 4. 8792 08:27:37,041 --> 08:27:38,841 All right, now I have a list of size 2. 8793 08:27:50,841 --> 08:27:52,370 If only one number, I'm done. 8794 08:27:53,594 --> 08:27:55,011 All right, what was the next step? 8795 08:27:55,010 --> 08:27:56,540 You have to now rewind in time. 8796 08:27:56,541 --> 08:28:00,871 I just sorted the left half of\nthe left half of the left half. 8797 08:28:07,100 --> 08:28:11,420 So now at this point in the story,\n 8798 08:28:11,420 --> 08:28:14,210 the 5 is sorted, and the 2 is sorted. 8799 08:28:14,210 --> 08:28:18,771 But what's the third and final step\n 8800 08:28:20,010 --> 08:28:21,960 So here's the left,\nhere's the right list. 8801 08:28:21,960 --> 08:28:23,210 How do I merge these together? 8802 08:28:23,210 --> 08:28:25,890 I compare the lists,\nand I put the 2 there. 8803 08:28:25,890 --> 08:28:27,951 I only have the 5\nleft, and I do that. 8804 08:28:27,951 --> 08:28:30,351 So now we see some visible progress. 8805 08:28:32,580 --> 08:28:37,220 We started to sort the left half of\n 8806 08:28:39,260 --> 08:28:42,450 We've just sorted the left\nhalf of the left half. 8807 08:28:42,451 --> 08:28:45,501 So what comes after sorting\nthe left half of anything? 
8808 08:28:46,052 --> 08:28:48,260 All right, here's the sort\nof same nonsensical thing. 8809 08:28:55,951 --> 08:28:59,241 So that's the 4, and that's the 7. 8810 08:29:00,471 --> 08:29:05,431 In total, I've now sorted the\nleft half of the original thing. 8811 08:29:07,453 --> 08:29:08,661 Wait a minute, wait a minute. 8812 08:29:10,911 --> 08:29:13,580 I have sorted the left\nhalf of the left half 8813 08:29:13,580 --> 08:29:16,710 and I've sorted the right\nhalf of the left half. 8814 08:29:16,710 --> 08:29:19,070 What do I now need to do lastly? 8815 08:29:19,070 --> 08:29:21,001 Merge those two lists together. 8816 08:29:21,001 --> 08:29:22,922 So again, I put my\nfinger on the beginning 8817 08:29:22,922 --> 08:29:24,630 of this list, the\nbeginning of this list. 8818 08:29:24,631 --> 08:29:26,721 And if you want, I'll do the same\nthing when I merged last time 8819 08:29:26,721 --> 08:29:28,191 to be clear what I'm comparing. 8820 08:29:28,190 --> 08:29:30,800 2 and 4-- the 2 obviously comes first. 8821 08:29:36,030 --> 08:29:40,280 The 5 comes next and then\nlastly, of course, the 7. 8822 08:29:40,280 --> 08:29:44,370 Notice that the 2457 are now sorted. 8823 08:29:44,370 --> 08:29:47,163 So the original left half is sorted. 8824 08:29:47,163 --> 08:29:49,370 And I'll do the rest a little\nfaster because, my God 8825 08:29:49,370 --> 08:29:50,745 this feels like it takes forever. 8826 08:29:50,745 --> 08:29:52,790 But I bet we're on to something here. 8827 08:29:54,681 --> 08:29:56,931 I've just sorted the left\nhalf of the original. 8828 08:29:56,931 --> 08:29:58,640 Sort the right half of the original. 8829 08:29:59,661 --> 08:30:02,241 I sort the left half of the right half. 8830 08:30:03,411 --> 08:30:05,721 I sort the left half of the left half. 8831 08:30:06,681 --> 08:30:08,721 I sort the right half of the left half. 8832 08:30:09,620 --> 08:30:11,751 Now I merge the two together. 
8833 08:30:11,751 --> 08:30:14,550 The 1 comes first, the 6 comes next. 8834 08:30:14,550 --> 08:30:18,740 Now I sort the right\nhalf of the right half. 8835 08:30:25,591 --> 08:30:27,440 So that's the third step of that phase. 8836 08:30:27,440 --> 08:30:32,120 Now where are we in the stor-- oh\n 8837 08:30:32,120 --> 08:30:36,591 We have sorted the left\nhalf of the right half 8838 08:30:36,591 --> 08:30:38,541 and the right half of the right half. 8839 08:30:40,623 --> 08:30:42,791 So I'm going to compare,\nand I'm going to move those 8840 08:30:42,791 --> 08:30:44,951 down just to make clear\nwhat I'm comparing 8841 08:30:44,951 --> 08:30:46,661 the beginning of both sublists. 8842 08:30:57,370 --> 08:30:59,290 And then lastly comes the 6. 8843 08:30:59,291 --> 08:31:01,241 All right, where are we in the story? 8844 08:31:01,241 --> 08:31:03,581 We've now sorted the\nleft half of the original 8845 08:31:03,580 --> 08:31:05,140 and the right half of the original. 8846 08:31:07,390 --> 08:31:09,431 All right, so I'm going\nto make the same point. 8847 08:31:09,431 --> 08:31:12,050 And this is actually\nliterally what we did earlier 8848 08:31:12,050 --> 08:31:16,120 because I deliberately demoed those\n 8849 08:31:37,061 --> 08:31:41,121 And lastly-- this is when\nwe run out of memory-- 8850 08:31:41,120 --> 08:31:45,370 the 7 over there is actually in place. 8851 08:31:47,751 --> 08:31:50,227 OK, so admittedly, a\nlittle harder to explain 8852 08:31:50,227 --> 08:31:52,310 and honestly, it gets a\nlittle trippy because it's 8853 08:31:52,311 --> 08:31:55,341 so easy to forget about\nwhere you are in the story 8854 08:31:55,341 --> 08:31:58,041 because we're constantly\ndiving into the algorithm 8855 08:31:58,041 --> 08:31:59,611 and then backing back out of it. 
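The whole walkthrough (quit if there is only one number, sort the left half, sort the right half, merge) can be sketched in C as below. This is one possible rendering of the pseudocode, not the lecture's own code: the name merge_sort, the scratch array tmp supplied by the caller, and the in-place copy back are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

// Sort a[0..n-1] in ascending order.
// Assumption: tmp is a caller-provided scratch array of at least n ints,
// standing in for the extra shelf used during merging.
void merge_sort(int a[], int tmp[], size_t n)
{
    // Base case: a list of zero or one numbers is already sorted.
    if (n < 2)
    {
        return;
    }

    size_t mid = n / 2;
    merge_sort(a, tmp, mid);           // sort the left half
    merge_sort(a + mid, tmp, n - mid); // sort the right half

    // Merge the two sorted halves into tmp.
    size_t i = 0, j = mid, k = 0;
    while (i < mid && j < n)
    {
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    }
    while (i < mid)
    {
        tmp[k++] = a[i++];
    }
    while (j < n)
    {
        tmp[k++] = a[j++];
    }

    // Copy the merged result back over the original numbers.
    memcpy(a, tmp, n * sizeof(int));
}
```

For example, sorting an array holding 5, 2, 7, 4, 1, 6, 3, 0 (an illustrative ordering) with an eight-element scratch array leaves it as 0 through 7 in order.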
8856 08:31:59,611 --> 08:32:02,331 But in code, we could\nexpress this pretty correctly 8857 08:32:02,330 --> 08:32:05,360 and, it turns out, pretty\nefficiently because what 8858 08:32:05,361 --> 08:32:09,021 I was doing, even though it's\nlonger when I do it verbally 8859 08:32:09,021 --> 08:32:12,620 I was touching these elements a\nminimal amount of times, right? 8860 08:32:12,620 --> 08:32:15,890 I wasn't going back and forth, back\n 8861 08:32:16,850 --> 08:32:21,600 I was deliberately only ever merging\n 8862 08:32:21,600 --> 08:32:24,560 So every time we merge, even\nthough I was doing it quickly 8863 08:32:24,561 --> 08:32:28,730 my fingers were only touching\neach of the elements once. 8864 08:32:28,730 --> 08:32:34,512 And how many times did we divide,\n 8865 08:32:34,512 --> 08:32:36,470 Well, we started with\nall of the elements here 8866 08:32:36,471 --> 08:32:37,679 and there were eight of them. 8867 08:32:37,679 --> 08:32:41,221 And then we moved them\n1, 2, 3 positions. 8868 08:32:41,221 --> 08:32:47,781 So the height of this visualization,\n 8869 08:32:47,780 --> 08:32:50,640 If I started with 8, turns\nout if you do the arithmetic 8870 08:32:50,640 --> 08:32:54,740 this is log n height\nbecause 2 to the 3 is 8. 8871 08:32:54,741 --> 08:32:57,111 But for now, just trust\nthat this is a log n height. 8872 08:32:58,640 --> 08:33:02,631 Well, it's of width n because\nthere's n elements any time 8873 08:33:03,830 --> 08:33:08,097 So technically, I was kind of\n 8874 08:33:08,098 --> 08:33:09,681 is the first time I've needed shelves. 8875 08:33:09,681 --> 08:33:13,370 With the human examples, we just had the\n 8876 08:33:13,881 --> 08:33:16,561 Here, I was sort of using\nmore and more memory. 8877 08:33:16,561 --> 08:33:19,131 In fact, I was using like\nfour times as much memory 8878 08:33:19,131 --> 08:33:21,291 even though that was just\nfor visualization's sake. 
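The "log n height" claim can be checked with a few lines of C that count how many times a list of n numbers can be halved before only one number is left. The function name halvings is hypothetical, introduced only for this sketch.

```c
// Count how many times n can be halved before reaching 1.
// For powers of 2 this is exactly log base 2 of n.
int halvings(int n)
{
    int levels = 0;
    while (n > 1)
    {
        n /= 2;
        levels++;
    }
    return levels;
}
```

For the eight numbers on the shelves, halvings(8) is 3, matching the three moves counted above. With about n units of merging work at each of those levels, the total comes to roughly n times log n.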
8879 08:33:21,291 --> 08:33:25,401 Merge sort actually requires that you\n 8880 08:33:25,401 --> 08:33:28,406 to move the elements into when\nyou're merging them together. 8881 08:33:28,405 --> 08:33:31,280 But if I really wanted and if I\n 8882 08:33:31,280 --> 08:33:33,950 honestly, I could have just gone back\n 8883 08:33:33,951 --> 08:33:35,611 That would have been sufficient. 8884 08:33:35,611 --> 08:33:40,550 So merge sort uses more memory\nfor this merging process 8885 08:33:40,550 --> 08:33:43,460 but the advantage of\nusing more memory is 8886 08:33:43,460 --> 08:33:49,041 that the total running time, if you can\n 8887 08:33:49,041 --> 08:33:51,921 The big O notation for\nmerge sort, it turns out 8888 08:33:51,920 --> 08:33:54,890 is actually going to be n times log n. 8889 08:33:54,890 --> 08:33:57,411 And even if you're a little\nrusty still on your logarithms 8890 08:33:57,411 --> 08:34:02,751 we saw in week zero and again\n 8891 08:34:05,241 --> 08:34:07,611 That's faster than linear\nsearch, which was n. 8892 08:34:07,611 --> 08:34:13,191 So n times log n is, of course,\n 8893 08:34:13,190 --> 08:34:16,190 So it's sort of lower on this little\n 8894 08:34:16,190 --> 08:34:19,710 which is to suggest that it's running\n 8895 08:34:19,710 --> 08:34:22,701 And in fact, if we consider\nthe best case running time 8896 08:34:22,701 --> 08:34:27,006 turns out it's not quite as good\nas bubble sort with omega of n 8897 08:34:27,006 --> 08:34:29,631 where you can just sort of abort\nif you realize, wait a minute 8898 08:34:30,681 --> 08:34:35,210 Merge sort, you actually have to do that\n 8899 08:34:35,210 --> 08:34:40,530 So it's actually in omega and\n 8900 08:34:40,530 --> 08:34:42,591 So again, a trade off\nthere because if you 8901 08:34:42,591 --> 08:34:44,901 happen to have a data set\nthat is very often sorted 8902 08:34:44,901 --> 08:34:47,026 honestly, you might want\nto stick with bubble sort. 
8903 08:34:47,026 --> 08:34:49,851 But in the general case,\nwhere the data is unsorted 8904 08:34:49,850 --> 08:34:53,181 n log n is sounding\nbetter than n squared. 8905 08:34:53,181 --> 08:34:55,190 Well, what does it\nactually look or feel like? 8906 08:34:55,190 --> 08:34:58,591 Give me a moment to just change\nover to our visualization here. 8907 08:34:58,591 --> 08:35:02,570 And we'll see with this example\nwhat merge sort looks like 8908 08:35:02,570 --> 08:35:04,530 depicted with now these vertical bars. 8909 08:35:04,530 --> 08:35:07,220 So same algorithm, but instead\nof my numbers on shelves 8910 08:35:07,221 --> 08:35:12,331 here is a random array\nof numbers being sorted. 8911 08:35:12,330 --> 08:35:14,480 And you can see it being\ndone half at a time. 8912 08:35:14,480 --> 08:35:18,080 And you see sort of remnants\nof the previous bars. 8913 08:35:21,861 --> 08:35:26,390 Let me zoom out so you can\nactually see the height here. 8914 08:35:26,390 --> 08:35:29,091 Let me go ahead and randomize\nthis again and run merge sort. 8915 08:35:29,670 --> 08:35:34,460 Now you can see the second array and\n 8916 08:35:34,460 --> 08:35:38,490 And even though this one looks way\n 8917 08:35:38,491 --> 08:35:40,531 it does seem to be moving faster. 8918 08:35:40,530 --> 08:35:44,030 And it seems to be merging halves\ntogether, and boom, it's done. 8919 08:35:44,030 --> 08:35:48,350 So let's actually see, in conclusion,\n 8920 08:35:48,350 --> 08:35:51,440 and consider that moving forward\nas we write more and more code 8921 08:35:51,440 --> 08:35:54,710 the goal is, again, not just to be\n 8922 08:35:54,710 --> 08:35:58,181 And one measure of design is\ngoing to indeed be efficiency. 
8923 08:35:58,181 --> 08:36:02,480 So here we have, in final, a\nvisualization of three algorithms-- 8924 08:36:02,480 --> 08:36:05,361 selection sort, bubble\nsort, and merge sort-- 8925 08:36:06,811 --> 08:36:09,980 And let's see what these algorithms\n 8926 08:36:09,980 --> 08:36:12,216 Oh, if we can dim the\nlights for dramatic effect-- 8927 08:36:16,521 --> 08:36:20,331 selection's on top, bubble on\nbottom, merge in the middle. 8928 08:39:13,991 --> 08:39:18,011 DAVID J. MALAN: Well, this is CS50,\n 8929 08:39:18,011 --> 08:39:19,991 and recall that last\nweek, week three, we 8930 08:39:19,991 --> 08:39:22,931 began to explore the inside of\na computer's memory a bit more. 8931 08:39:22,932 --> 08:39:25,992 We talked about arrays, which\nwere just chunks of memory 8932 08:39:25,991 --> 08:39:28,811 back to back to back that really\n 8933 08:39:28,812 --> 08:39:32,082 to bottom, and this is actually a\n 8934 08:39:32,081 --> 08:39:34,121 new to programming,\nand certainly new to C. 
8935 08:39:34,121 --> 08:39:39,131 You've seen this approach of just using\n 8936 08:39:40,522 --> 08:39:45,731 So for instance, here is a photo taken\n 8937 08:39:45,731 --> 08:39:49,151 and this is an opportunity to\nexplore exactly what happens 8938 08:39:49,151 --> 08:39:52,271 if we start to zoom in and zoom in and\n 8939 08:39:52,272 --> 08:39:56,022 any TV show like CSI, or\nwhatever, or any movie that 8940 08:39:56,022 --> 08:40:01,961 explores forensic information might\n 8941 08:40:01,961 --> 08:40:05,354 on an image like this to see\nwhat the glint in someone's eye 8942 08:40:05,354 --> 08:40:08,022 is because that reveals the license\nplate number of someone that 8943 08:40:08,917 --> 08:40:10,792 Something that's a little\nover the top there 8944 08:40:10,792 --> 08:40:14,022 but there's an opportunity here to\n 8945 08:40:14,022 --> 08:40:17,022 For instance, let's zoom on\nthis puppet here's eye and let's 8946 08:40:17,022 --> 08:40:19,332 zoom in a little more to\nsee what might be reflected. 8947 08:40:19,331 --> 08:40:21,941 Let's zoom in a little\nmore, and that's it. 8948 08:40:21,941 --> 08:40:24,412 There's only finite\namount of information 8949 08:40:24,412 --> 08:40:26,531 if you have an image\nrepresented in this way. 
8950 08:40:26,531 --> 08:40:29,682 We're using pixels-- these dots on\n 8951 08:40:29,682 --> 08:40:32,141 because if you're only using\na finite amount of memory 8952 08:40:32,141 --> 08:40:35,472 then at the end of the day, you can only\n 8953 08:40:35,472 --> 08:40:39,282 At least I don't really see in this\n 8954 08:40:39,281 --> 08:40:42,011 or something like that that you\n 8955 08:40:42,011 --> 08:40:45,041 So today we'll explore these\nkinds of representations 8956 08:40:45,042 --> 08:40:47,862 of how you might use memory\nin new and interesting ways 8957 08:40:47,862 --> 08:40:51,222 to represent now, very\nfamiliar things, but also 8958 08:40:51,222 --> 08:40:54,432 start to explore what some of the\n 8959 08:40:54,432 --> 08:40:58,211 But consider after all that this doesn't\n 8960 08:40:58,211 --> 08:41:00,522 as many pixels as something\nlike this other image 8961 08:41:00,522 --> 08:41:04,492 you can imagine just doing something\n 8962 08:41:04,491 --> 08:41:07,181 And if you think of an image as\njust having rows and columns 8963 08:41:07,182 --> 08:41:09,492 these rows otherwise known\nas scan lines-- something 8964 08:41:09,491 --> 08:41:13,061 we'll explore in the coming week--\n 8965 08:41:13,062 --> 08:41:17,472 by just using two different\nvalues, maybe a zero and a one. 8966 08:41:17,472 --> 08:41:21,502 Or yellow and purple, or vice versa,\n 8967 08:41:21,502 --> 08:41:25,691 Now in practice, recall we talked\n 8968 08:41:25,691 --> 08:41:32,774 but maybe an R, a G, and a B value--\n 8969 08:41:32,775 --> 08:41:33,942 but we'll come back to that. 8970 08:41:33,941 --> 08:41:35,649 That would just be a\nmore involved image. 
8971 08:41:35,650 --> 08:41:41,472 But for fun, if today you want to tackle\n 8972 08:41:41,472 --> 08:41:44,891 if you go to this URL here,\nwe've put together an opportunity 8973 08:41:47,562 --> 08:41:51,162 If you go to this URL here, that'll\n 8974 08:41:51,162 --> 08:41:53,502 If you have a laptop\nwith you today that'll 8975 08:41:53,502 --> 08:41:56,902 look a little something like this, which\n 8976 08:41:56,901 --> 08:42:01,241 So if you'd like to go ahead and use\n 8977 08:42:01,241 --> 08:42:04,691 feature to color in those\nindividual squares if you'd like 8978 08:42:04,691 --> 08:42:08,111 see if you can't make something a little\n 8979 08:42:08,112 --> 08:42:12,202 and we'll exhibit some of the best or\n 8980 08:42:12,202 --> 08:42:15,425 So let's transition then to something\n 8981 08:42:15,424 --> 08:42:17,592 And not all of you have\nused, presumably, Photoshop 8982 08:42:17,592 --> 08:42:20,842 but you're probably generally familiar\n 8983 08:42:20,842 --> 08:42:23,062 and creating images\nor photos or the like. 8984 08:42:23,062 --> 08:42:25,992 And here is a screenshot\nof Photoshop's color picker 8985 08:42:25,991 --> 08:42:27,978 via which you can\nchange what color you're 8986 08:42:27,978 --> 08:42:30,311 going to draw with the paint\nbrush, or what color you're 8987 08:42:30,312 --> 08:42:32,292 going to fill in with the paint bucket. 8988 08:42:32,292 --> 08:42:34,391 It's representative of any\nkind of graphical tool. 8989 08:42:34,391 --> 08:42:36,801 And there's a lot of\ninformation in here 8990 08:42:36,801 --> 08:42:39,281 but there's perhaps some\nfamiliar terms now-- 8991 08:42:39,281 --> 08:42:43,151 R, G, and B. In fact, right\nnow this is Photoshop's way 8992 08:42:43,151 --> 08:42:45,851 of saying you're about to fill\nin your background or foreground 8993 08:42:45,851 --> 08:42:48,041 with the color black,\nand that appears to be 8994 08:42:48,042 --> 08:42:51,492 represented with an R, a G, and\na B value of zero, zero, zero. 
8995 08:42:51,491 --> 08:42:57,341 Or alternatively, using a\nhash symbol and then 000000. 8996 08:42:57,342 --> 08:42:59,801 And if some of you have\nalready made web pages before 8997 08:42:59,801 --> 08:43:01,691 and you know a little\nbit of HTML and CSS 8998 08:43:01,691 --> 08:43:04,031 you probably are familiar\nwith this kind of syntax-- 8999 08:43:04,031 --> 08:43:07,891 a hash symbol and then six, or\n 9000 08:43:07,891 --> 08:43:10,391 And if we look at a few different\ncolors here, for instance 9001 08:43:10,391 --> 08:43:12,491 here might be the\nrepresentation of white. 9002 08:43:12,491 --> 08:43:18,671 Now the R, the G, and the B values\n 9003 08:43:18,671 --> 08:43:23,471 Or alternatively, it looks like\n 9004 08:43:23,472 --> 08:43:26,950 could represent that same\ncolor white with FFFFFF. 9005 08:43:26,950 --> 08:43:28,242 And let's just do a few others. 9006 08:43:28,241 --> 08:43:32,981 Here is red, and it turns out that\n 9007 08:43:37,909 --> 08:43:39,702 So there's perhaps a\npattern here emerging. 9008 08:43:39,702 --> 08:43:43,782 Here is green, zero, 255, zero, a.k.a. 9009 08:43:43,781 --> 08:43:48,022 00FF00, or lastly, here\nblue, which is no red 9010 08:43:48,022 --> 08:43:51,731 no green but apparently a lot\nof blue, 255 again, a.k.a. 9011 08:43:53,831 --> 08:43:57,221 Now some of you, again, might\nhave seen this notation before 9012 08:43:57,222 --> 08:44:00,432 these zeros and these F's and all of\n 9013 08:44:00,432 --> 08:44:02,205 but this is another form of notation. 9014 08:44:02,205 --> 08:44:04,122 And in fact, we'll explore\nthis today-- really 9015 08:44:04,121 --> 08:44:06,851 is just a precondition for\ntalking about some other concepts. 9016 08:44:06,851 --> 08:44:10,002 But the ideas, ultimately,\nare really no different. 
9017 08:44:10,002 --> 08:44:13,182 What we're about to see is\na different base system-- 9018 08:44:13,182 --> 08:44:15,312 not just binary, not just\ndecimal, but something 9019 08:44:15,312 --> 08:44:17,231 we're about to call hexadecimal. 9020 08:44:17,231 --> 08:44:21,191 But first, recall that with RGB\nwe previously did the following. 9021 08:44:21,191 --> 08:44:23,592 Any RGB value-- red,\ngreen, blue-- just combine 9022 08:44:23,592 --> 08:44:26,121 some amount of red or green or blue. 9023 08:44:26,121 --> 08:44:30,702 So here we have 72, 73, 33, which in the\n 9024 08:44:33,761 --> 08:44:36,252 Just hi with an exclamation\npoint, but in the context 9025 08:44:36,252 --> 08:44:40,481 of a Photoshop-like program, this\nmight instead be representing 9026 08:44:40,481 --> 08:44:42,919 collectively, this shade\nof yellow, for instance 9027 08:44:42,919 --> 08:44:45,502 when you combine that much red\nthat much green that much blue. 9028 08:44:46,812 --> 08:44:49,062 If you've got a lot of\nred, no green, no blue 9029 08:44:49,062 --> 08:44:50,652 together that's going to give us red. 9030 08:44:50,651 --> 08:44:53,441 If you've got no red, a\nlot of green, no blue 9031 08:44:53,441 --> 08:44:55,211 that's going to give\nus, of course, green. 9032 08:44:55,211 --> 08:44:58,529 If you've got no red, no green,\na lot of blue, that of course 9033 08:44:59,572 --> 08:45:03,761 So there's a pattern emerging here\n 9034 08:45:05,952 --> 08:45:12,641 And it's maybe somehow equated with 255,\n 9035 08:45:12,641 --> 08:45:15,912 Meanwhile, if we combine one last\n 9036 08:45:16,991 --> 08:45:20,719 that's actually going to give us\na single white pixel like this. 
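The two notations shown in the color picker, three decimal values for R, G, and B and a hash symbol followed by six characters, describe the same color. A minimal C sketch of that correspondence follows. The function name rgb_to_hex is a made-up helper for illustration, not something from the lecture.

```c
#include <stdio.h>

// Hypothetical helper (not from the lecture): writes "#RRGGBB" into out,
// which must have room for 8 chars (7 visible characters plus the '\0').
void rgb_to_hex(unsigned char r, unsigned char g, unsigned char b, char out[8])
{
    // %02X prints each value as exactly two uppercase hexadecimal digits.
    snprintf(out, 8, "#%02X%02X%02X", (unsigned) r, (unsigned) g, (unsigned) b);
}
```

Calling rgb_to_hex(255, 255, 255, buffer) fills buffer with #FFFFFF, matching the white swatch, and rgb_to_hex(0, 255, 0, buffer) gives #00FF00 for green.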
9037 08:45:21,761 --> 08:45:25,479 Here was binary-- in the world of binary\n 9038 08:45:25,479 --> 08:45:26,772 Could have been anything else-- 9039 08:45:26,772 --> 08:45:31,902 A or B, X or Y, but the world\nstandardized on these numerals 9040 08:45:32,741 --> 08:45:35,951 In our world's decimal system, of\n 9041 08:45:35,952 --> 08:45:39,461 As of today though, we're going to\n 9042 08:45:39,461 --> 08:45:43,346 in the context of images and also\n 9043 08:45:43,347 --> 08:45:45,195 and there's some conveniences to it. 9044 08:45:45,194 --> 08:45:47,112 Where now, you're going\nto be able to count up 9045 08:45:47,112 --> 08:45:49,961 to F in a notation called hexadecimal. 9046 08:45:49,961 --> 08:45:55,031 From zero through nine, then you keep\n 9047 08:45:55,031 --> 08:45:58,002 the idea being each of these,\neven though it's weirdly 9048 08:45:58,002 --> 08:46:02,141 a letter of the English alphabet,\n 9049 08:46:02,141 --> 08:46:07,601 It's not one zero for 10, or 1 1\n 9050 08:46:07,601 --> 08:46:10,961 these digits, so to speak, are\nindeed still just single symbols 9051 08:46:10,961 --> 08:46:14,572 and that's a characteristic of just\n 9052 08:46:14,572 --> 08:46:20,112 So how do we get from 00 and FF to\n 9053 08:46:20,112 --> 08:46:22,121 Well, this hexadecimal system, a.k.a. 9054 08:46:22,121 --> 08:46:25,546 Base 16, just does the math\nfrom week zero and really 9055 08:46:25,546 --> 08:46:27,171 grade school, a little bit differently. 9056 08:46:27,171 --> 08:46:30,341 For instance, if you have a\nnumber that's got two digits 9057 08:46:30,342 --> 08:46:34,281 or hexadecimal digits as of today, the\n 9058 08:46:34,281 --> 08:46:37,871 Instead of powers of two or powers of\n 9059 08:46:37,871 --> 08:46:40,631 respectively, it's powers of 16. 9060 08:46:40,632 --> 08:46:43,362 So if we just do the math\nout, that's the ones column 9061 08:46:43,362 --> 08:46:46,092 this is the 16s column, and so forth. 
9062 08:46:46,092 --> 08:46:49,101 Things get actually pretty big\npretty quickly in this system. 9063 08:46:49,101 --> 08:46:52,106 But now let's just consider how we\n 9064 08:46:52,106 --> 08:46:54,731 If you've got two hexadecimal\ndigits for which these hashes are 9065 08:46:54,731 --> 08:46:57,792 just placeholders, zero, zero\nis going to mathematically 9066 08:46:57,792 --> 08:47:00,292 equal the decimal number you\nand I know, of course, as zero. 9067 08:47:02,081 --> 08:47:06,401 16 times zero plus one times zero is\n 9068 08:47:06,401 --> 08:47:07,881 And we can count up from here. 9069 08:47:07,882 --> 08:47:10,391 This, in hexadecimal,\nwould be how a computer 9070 08:47:10,391 --> 08:47:12,191 represents the number we know as one. 9071 08:47:12,191 --> 08:47:14,182 It would be zero one in this case. 9072 08:47:14,182 --> 08:47:19,542 This would be two, three, four,\nfive, six, seven, eight, nine-- 9073 08:47:19,542 --> 08:47:21,502 in decimal, we're about to go to 10. 9074 08:47:21,502 --> 08:47:24,572 But in hexadecimal, to be\nclear, what comes next? 9075 08:47:24,572 --> 08:47:33,382 So, apparently A, so 0A, 0B, which\n 9076 08:47:33,382 --> 08:47:36,472 So using hexadecimal is\njust an interesting way 9077 08:47:36,472 --> 08:47:40,312 of using single symbols\nnow, zero through F 9078 08:47:40,312 --> 08:47:43,262 to count from zero through 15. 9079 08:47:43,261 --> 08:47:46,011 And we'll see why it's 15 in a\n 9080 08:47:46,011 --> 08:47:50,181 anyone want to conjecture how\nin hexadecimal, a.k.a. hex 9081 08:47:50,182 --> 08:47:53,092 do we now count up one position higher? 9082 08:47:53,092 --> 08:47:56,792 What comes after 0F in hexadecimal? 
9083 08:47:56,792 --> 08:47:59,062 So, one zero-- it's the\nsame kind of thing-- 9084 08:47:59,062 --> 08:48:01,227 once you're at the highest\ndigit possible, F-- 9085 08:48:01,226 --> 08:48:03,351 or in our decimal world\nthat would have been nine-- 9086 08:48:03,351 --> 08:48:06,471 you add one more, nine wraps\naround to zero, or in this case 9087 08:48:08,182 --> 08:48:11,152 You carry the one and voila--\nnow we're representing 9088 08:48:11,151 --> 08:48:12,871 the number you and I know as 16. 9089 08:48:12,871 --> 08:48:14,811 And we could keep going\nforever, literally. 9090 08:48:14,812 --> 08:48:18,547 This could be 17, 18,\n19, 20, in decimal-- 9091 08:48:18,546 --> 08:48:20,421 but let's just wave our\nhands at it and count 9092 08:48:20,421 --> 08:48:23,181 as high as we can-- dot,\ndot, dot-- the highest 9093 08:48:23,182 --> 08:48:26,542 we could count in hexadecimal\nwith two digits, just logically 9094 08:48:26,542 --> 08:48:28,342 would be what, in hexadecimal? 9095 08:48:31,312 --> 08:48:34,892 So yes, that's the biggest digit\n 9096 08:48:34,891 --> 08:48:38,524 So how high can you count in hexadecimal\n 9097 08:48:38,524 --> 08:48:39,981 Well, it's the same math as always. 
9098 08:48:41,932 --> 08:48:48,301 15, so that's 16 times 15 plus\none times F, or one times 15-- 9099 08:48:48,301 --> 08:48:52,702 that gives us 240 plus 15 in decimal,\n 9100 08:48:54,781 --> 08:48:57,871 So this hexadecimal system-- you may\n 9101 08:48:57,871 --> 08:49:00,621 and if you haven't we'll get to\n 9102 08:49:00,621 --> 08:49:03,351 or we just saw in the\ncontext of Photoshop-- just 9103 08:49:03,351 --> 08:49:09,502 has this shorthand notation of counting\n 9104 08:49:09,502 --> 08:49:13,132 Now it's marginal, but that's like\n 9105 08:49:13,132 --> 08:49:16,852 you need in order to count as high\n 9106 08:49:18,682 --> 08:49:22,492 In hexadecimal you can count\nas high using just two 9107 08:49:22,491 --> 08:49:25,849 and that difference is going to get\n 9108 08:49:25,849 --> 08:49:28,641 Let me stipulate for now, you're\n 9109 08:49:28,641 --> 08:49:31,792 in terms of just how many symbols\n 9110 08:49:31,792 --> 08:49:35,242 bigger and bigger numbers than that. 9111 08:49:35,241 --> 08:49:38,661 All right, let me pause here just to\n 9112 08:49:38,662 --> 08:49:42,081 on what we've called hexadecimal, which\n 9113 08:49:42,081 --> 08:49:48,769 as well as A through F.\nAny questions or confusion? 9114 08:49:48,769 --> 08:49:51,351 And if it feels like we're\nlingering a bit much on arithmetic 9115 08:49:51,351 --> 08:49:54,691 we're not really going to see other\n 9116 08:49:54,691 --> 08:49:58,822 These are the go-to three in a\nprogrammer's world, typically. 9117 08:50:01,600 --> 08:50:03,893 AUDIENCE: Does the hexadecimal\nsymbol take more storage 9118 08:50:06,612 --> 08:50:07,862 DAVID J. MALAN: Good question. 9119 08:50:07,862 --> 08:50:11,972 Does hexadecimal require more storage\n 9120 08:50:11,972 --> 08:50:16,202 Theoretically no, because this is\n 9121 08:50:16,202 --> 08:50:19,082 and we'll see in a concrete\nexample in a moment. 
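The place-value arithmetic described above, 16 times the left digit plus 1 times the right digit, can be sketched in a few lines of C. Both function names here are hypothetical helpers chosen for illustration. The sketch assumes it is given valid hexadecimal digits.

```c
// Hypothetical helper: the value of one hexadecimal digit, or -1 if c is not one.
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9')
    {
        return c - '0';
    }
    if (c >= 'A' && c <= 'F')
    {
        return c - 'A' + 10;
    }
    if (c >= 'a' && c <= 'f')
    {
        return c - 'a' + 10;
    }
    return -1;
}

// Hypothetical helper: 16 times the digit in the 16s column plus
// 1 times the digit in the ones column.
// Assumes both characters are valid hexadecimal digits.
int two_hex_digits(char sixteens, char ones)
{
    return 16 * hex_digit_value(sixteens) + 1 * hex_digit_value(ones);
}
```

With these definitions, two_hex_digits('0', 'F') is 15, two_hex_digits('1', '0') is 16, and two_hex_digits('F', 'F') is 240 plus 15, or 255.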
9122 08:50:19,081 --> 08:50:22,471 But inside of the computer, at the end\n 9123 08:50:22,472 --> 08:50:25,588 And using hexadecimal is not\nusing more or fewer bits 9124 08:50:25,588 --> 08:50:27,421 think of this as how\nyou might write it down 9125 08:50:27,421 --> 08:50:30,331 on a piece of paper, just how\nmany digits you're going to write 9126 08:50:30,331 --> 08:50:33,301 or on a computer screen, how many\n 9127 08:50:33,301 --> 08:50:36,572 but it doesn't change how the\n 9128 08:50:36,572 --> 08:50:39,691 because all they're representing at\n 9129 08:50:40,981 --> 08:50:45,211 If this-- a moment ago\nFF I claimed was 255-- 9130 08:50:45,211 --> 08:50:47,252 let's just rewind to week\nzero and if we wanted 9131 08:50:47,252 --> 08:50:51,752 to count to 255 in binary, that's\n 9132 08:50:52,772 --> 08:50:54,604 And there's only a few\nof these numbers that 9133 08:50:54,604 --> 08:50:58,441 are useful to memorize, like 255 is as\n 9134 08:50:58,441 --> 08:51:02,342 if you start at zero, because two to the\n 9135 08:51:04,831 --> 08:51:09,031 So in binary, recall if you have\n 9136 08:51:09,031 --> 08:51:11,351 and I won't do out the\nmath pedantically here 9137 08:51:11,351 --> 08:51:13,726 but if I do do this plus\nthis plus this, dot, dot 9138 08:51:13,726 --> 08:51:16,752 dot-- that's also going to give me 255. 9139 08:51:16,752 --> 08:51:19,801 So this is what's interesting\nhere about hexadecimal. 9140 08:51:19,801 --> 08:51:24,211 It turns out that an upside of\nstoring values in hexadecimal 9141 08:51:24,211 --> 08:51:27,932 is that we're going to\nsee the first F represents 9142 08:51:27,932 --> 08:51:31,262 the left half of all these bits,\nand the second F in this case 9143 08:51:31,261 --> 08:51:33,791 represents the rightmost\nfour of these bits. 9144 08:51:33,792 --> 08:51:36,422 So it turns out hexadecimal\nis very useful when you 9145 08:51:36,421 --> 08:51:39,391 want to treat data in units of four. 
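The claim that each hexadecimal digit stands for four bits can be checked with a small sketch that splits one byte into its left and right halves. The helper names are assumptions made for this example. The shift and mask operators it uses are standard C but have not been introduced in the lecture at this point.

```c
// Hypothetical helper: the leftmost four bits of a byte, as a value 0 through 15.
unsigned high_four_bits(unsigned char byte)
{
    // Shifting right by four discards the rightmost four bits.
    return (unsigned) byte >> 4;
}

// Hypothetical helper: the rightmost four bits of a byte, as a value 0 through 15.
unsigned low_four_bits(unsigned char byte)
{
    // 0x0F is 00001111 in binary, so the mask keeps only the rightmost four bits.
    return (unsigned) byte & 0x0F;
}
```

For the byte 0xFF, which is 11111111 in binary and 255 in decimal, both helpers return 15, which is F. For 0xA5 the left half is 0xA and the right half is 0x5, so the two hex digits can be read off one half at a time.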
9146 08:51:39,391 --> 08:51:42,542 It's not quite eight, but units\nof four, and that's not bad. 9147 08:51:42,542 --> 08:51:45,632 Which is why-- if you use two\ndigits like I have thus far 9148 08:51:45,632 --> 08:51:48,422 00 or FF or anything in between-- 9149 08:51:48,421 --> 08:51:53,281 that's actually a convenient way of\n 9150 08:51:53,281 --> 08:51:57,451 One hex digit for the first four\n 9151 08:51:57,452 --> 08:52:00,152 And again, there's nothing new\nintellectually here per se 9152 08:52:00,151 --> 08:52:03,931 it's just a different way of\n 9153 08:52:05,011 --> 08:52:06,851 So in what context do we see this? 9154 08:52:06,851 --> 08:52:08,191 Well, we talked about\nmemory last week, and we're 9155 08:52:08,191 --> 08:52:09,774 going to talk more about it this week. 9156 08:52:09,775 --> 08:52:12,301 If this is my computer's\nRAM-- random access memory-- 9157 08:52:12,301 --> 08:52:16,471 you can again think of each byte as\n 9158 08:52:18,031 --> 08:52:22,351 This might be zero, this might\nbe 2 billion, and so in the past 9159 08:52:22,351 --> 08:52:25,141 I've described these as just\nthis, using decimal numbers. 9160 08:52:25,141 --> 08:52:29,491 Here's byte zero, one, two, three,\n 9161 08:52:29,491 --> 08:52:30,941 would be here, and so forth. 9162 08:52:30,941 --> 08:52:35,432 But it turns out in the world of memory,\n 9163 08:52:35,432 --> 08:52:40,051 tend to count memory\nbytes using hexadecimal. 9164 08:52:40,051 --> 08:52:42,241 Partly just by convention,\nbut also partly 9165 08:52:42,241 --> 08:52:44,941 because it's a little more\nsuccinct and again, each digit 9166 08:52:44,941 --> 08:52:48,002 represents four bits, typically. 9167 08:52:48,002 --> 08:52:49,757 So what comes after F here? 9168 08:52:49,757 --> 08:52:51,632 Well, if I think about\nthe computer's memory 9169 08:52:51,632 --> 08:52:56,672 I normally might do\nafter F, which is 15, 16. 
9170 08:52:56,671 --> 08:53:01,291 But instead, one zero, one\none, one two, one three-- this 9171 08:53:01,292 --> 08:53:05,912 is not 10, 11, 12, 13, because I claim\n 9172 08:53:05,912 --> 08:53:07,981 As per the previous\nslide, we already started 9173 08:53:07,981 --> 08:53:10,801 going into A's through\nF's, so you immediately 9174 08:53:10,801 --> 08:53:13,471 see here a possible problem. 9175 08:53:13,472 --> 08:53:16,442 Why is this now worrisome,\nif all of a sudden you're 9176 08:53:16,441 --> 08:53:22,151 seeing seemingly familiar\nnumbers like 10, 11, 12, 13? 9177 08:53:22,151 --> 08:53:24,288 We didn't really stumble\nacross this problem 9178 08:53:24,289 --> 08:53:25,871 when it was all zeros and ones before. 9179 08:53:26,974 --> 08:53:28,516 AUDIENCE: Try to do math [INAUDIBLE]. 9180 08:53:30,645 --> 08:53:33,312 DAVID J. MALAN: Yeah, so if you're\nwriting some code in C that's 9181 08:53:33,312 --> 08:53:35,170 doing some math, you\nmight accidentally-- 9182 08:53:35,169 --> 08:53:37,961 or the computer might accidentally\n 9183 08:53:37,961 --> 08:53:40,522 if they look in some context the same. 9184 08:53:40,522 --> 08:53:42,612 Any number on the board\nthat doesn't have a letter 9185 08:53:42,612 --> 08:53:46,402 is ambiguously hexadecimal\nor decimal at this point 9186 08:53:46,401 --> 08:53:48,111 and so how might we resolve this? 9187 08:53:48,112 --> 08:53:51,072 Well, it turns out that what\ncomputers typically do is this. 
9188 08:53:51,072 --> 08:53:55,842 By convention, any time you\nsee 0x and then a number 9189 08:53:55,842 --> 08:53:58,272 that's a human convention of saying-- 9190 08:53:58,272 --> 08:54:01,731 signaling to the reader that this\n 9191 08:54:01,731 --> 08:54:05,801 So if it's 0x10, that\nis not the number 10 9192 08:54:05,801 --> 08:54:10,971 that is the hexadecimal number one\n 9193 08:54:13,991 --> 08:54:16,511 And again, these are not the\nkinds of things to memorize 9194 08:54:16,511 --> 08:54:19,921 it's really just the system for\n 9195 08:54:19,921 --> 08:54:22,421 So henceforth today, we're going\nto start seeing hexadecimal 9196 08:54:23,831 --> 08:54:26,861 When you write code, you might even\n 9197 08:54:26,862 --> 08:54:29,362 but again, it's just a different\nway of representing numbers 9198 08:54:29,362 --> 08:54:32,621 and humans have different\nconventions for different contexts. 9199 08:54:32,621 --> 08:54:36,131 All right, so with that said, any\n 9200 08:54:36,132 --> 08:54:41,682 But here on out, we'll start\nusing it in some actual code. 9201 08:54:45,441 --> 08:54:49,182 So, let's go ahead and consider\nmaybe a familiar example. 9202 08:54:49,182 --> 08:54:52,932 Something where involving code,\n 9203 08:54:52,932 --> 08:54:54,750 to a value like 50, in this case. 9204 08:54:54,750 --> 08:54:57,042 And then let's start to tinker\naround with what's going 9205 08:54:57,042 --> 08:54:58,752 on inside of the computer's memory. 
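The ambiguity that the 0x convention resolves can be seen by reading the same two characters, 1 and 0, in different bases. The sketch below uses the standard library function strtol. The wrapper name read_digits is a made-up helper, not part of the lecture.

```c
#include <stdlib.h>

// Hypothetical helper: interpret text as a whole number in the given base.
// With base 0, strtol decides from the prefix: "0x" means hexadecimal,
// a leading "0" means octal, and anything else is read as decimal.
long read_digits(const char *text, int base)
{
    return strtol(text, NULL, base);
}
```

Here read_digits("10", 10) is ten and read_digits("10", 16) is sixteen, so the digits alone do not settle the question. With the prefix, read_digits("0x10", 0) is sixteen. C source code follows the same convention, so the literal 0x10 in a program is the number 16.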
9206 08:54:58,752 --> 08:55:01,551 In a moment I'm going to load\nup VS Code on my computer 9207 08:55:01,551 --> 08:55:04,871 and I'm going to go ahead and whip\n 9208 08:55:04,871 --> 08:55:08,592 a value like the number\n50 to a variable called n 9209 08:55:08,592 --> 08:55:14,397 but today, keep in mind that\nthat variable n and that value 50 9210 08:55:14,397 --> 08:55:16,764 is going to be stored somewhere\nin my computer's memory 9211 08:55:16,764 --> 08:55:19,932 and it turns out today we'll introduce\n 9212 08:55:19,932 --> 08:55:22,371 see where things are being stored. 9213 08:55:22,371 --> 08:55:24,072 So let me click over to VS Code here. 9214 08:55:24,072 --> 08:55:27,042 I'm going to create a\nprogram called address.c just 9215 08:55:27,042 --> 08:55:29,532 to explore computer's\naddresses today, and I'm 9216 08:55:29,531 --> 08:55:34,061 going to do an include stdio.h,\nint main(void), as usual. 9217 08:55:34,062 --> 08:55:35,802 No command line arguments for now. 9218 08:55:35,801 --> 08:55:38,403 I'm going to declare that\nvariable n equals 50 9219 08:55:38,403 --> 08:55:40,611 and then I'm just going to\ngo ahead and print it out. 9220 08:55:40,612 --> 08:55:46,092 So nothing very interesting but I'll\n 9221 08:55:47,682 --> 08:55:50,672 Nothing here should be very\ninteresting to compile or run 9222 08:55:50,671 --> 08:55:53,171 but I'll do it just to make\nsure I didn't make any mistakes. 9223 08:55:53,171 --> 08:55:58,662 Looks like as expected, it simply\n 9224 08:55:58,662 --> 08:56:02,141 But let's consider then, what this\n 9225 08:56:02,141 --> 08:56:04,882 when it's actually run on your machine. 9226 08:56:04,882 --> 08:56:06,762 So here we have that grid of memory. 9227 08:56:06,761 --> 08:56:10,811 That variable n is an int,\nand if you think back 9228 08:56:10,812 --> 08:56:14,412 how many bytes typically\ndo we use for an int? 9229 08:56:15,491 --> 08:56:18,050 Four, so four bytes, or 32 bits. 
9230 08:56:18,050 --> 08:56:21,851 So if each of these squares represents\n 9231 08:56:21,851 --> 08:56:25,173 in my memory, or RAM, is\nusing four of these squares. 9232 08:56:25,173 --> 08:56:27,881 Maybe it ends up over here just\n 9233 08:56:27,882 --> 08:56:29,092 used elsewhere, for instance. 9234 08:56:29,092 --> 08:56:30,842 Though I don't really\nknow, and frankly, I 9235 08:56:30,842 --> 08:56:33,634 don't really care where it ends\n 9236 08:56:33,633 --> 08:56:37,300 So the variable-- the value 50 is\n 9237 08:56:37,300 --> 08:56:40,941 Even though I've written it as\ndecimal, just like in my code-- 9238 08:56:40,941 --> 08:56:45,544 let me again remind that this is 32\n 9239 08:56:45,544 --> 08:56:48,711 it's just going to be very tedious if\n 9240 08:56:48,711 --> 08:56:51,711 so I'll use the more comfortable\nhuman decimal system. 9241 08:56:51,711 --> 08:56:54,502 So that's what's going on\ninside of the computer's memory. 9242 08:56:54,502 --> 08:56:58,932 So what if I actually wanted to\n 9243 08:56:58,932 --> 08:57:01,452 or maybe just knowing its location? 9244 08:57:01,452 --> 08:57:05,262 Well, this variable n\nindeed has a name, n-- 9245 08:57:05,261 --> 08:57:09,123 that's a label of sorts for it--\n 9246 08:57:09,123 --> 08:57:11,831 technically at a specific address,\n 9247 08:57:11,831 --> 08:57:14,861 0x123, and it's 123\nbecause I really don't 9248 08:57:14,862 --> 08:57:17,781 care what it is, I just want an\n 9249 08:57:17,781 --> 08:57:24,311 So way over here off screen might be\n 9250 08:57:24,312 --> 08:57:28,222 It's in hexadecimal\nnotation just by convention. 9251 08:57:28,222 --> 08:57:32,052 So how can I actually see where\nmy variables are ending up 9252 08:57:32,051 --> 08:57:33,702 in memory if I'm curious to do so? 9253 08:57:33,702 --> 08:57:37,182 Well, let me go back to my\ncode here and let me actually 9254 08:57:37,182 --> 08:57:39,441 change this just a little bit. 
9255 08:57:39,441 --> 08:57:44,741 Let me go ahead and introduce,\nfor instance, another symbol 9256 08:57:44,741 --> 08:57:48,941 here and another topic\naltogether, namely pointers. 9257 08:57:48,941 --> 08:57:54,471 So a pointer is a variable that\n 9258 08:57:54,472 --> 08:57:57,731 the location of some value\nor more specifically 9259 08:57:57,731 --> 08:58:01,042 the specific byte in which\nthat value is stored. 9260 08:58:01,042 --> 08:58:04,301 So again, if you think of your memory\n 9261 08:58:04,301 --> 08:58:07,061 zero at top left, 2 billion\nor whatever at bottom right 9262 08:58:07,062 --> 08:58:08,562 depending on how much RAM you have-- 9263 08:58:08,562 --> 08:58:10,842 each of those things has\na location, or an address. 9264 08:58:10,842 --> 08:58:14,932 A pointer is just a variable\nstoring one such address. 9265 08:58:14,932 --> 08:58:20,112 So it turns out that in the world of\n 9266 08:58:20,112 --> 08:58:24,472 we can use if we want to see what\n 9267 08:58:24,472 --> 08:58:27,402 and those two operators,\nas of today, are these. 9268 08:58:27,401 --> 08:58:31,191 You can use the ampersand\noperator in C in a couple of ways. 9269 08:58:31,191 --> 08:58:34,121 We already saw it very briefly\nto do ampersand ampersand-- 9270 08:58:34,121 --> 08:58:37,631 it's kind of and two\nBoolean expressions together 9271 08:58:37,632 --> 08:58:39,172 in the context of a conditional. 9272 08:58:40,182 --> 08:58:43,992 A single ampersand is\nthe address of operator. 9273 08:58:43,991 --> 08:58:48,011 So literally, in your code, if you've\n 9274 08:58:48,011 --> 08:58:53,261 and you write &n, C is going to figure\n 9275 08:58:53,261 --> 08:58:55,731 variable n in the computer's memory. 
9276 08:58:55,731 --> 08:59:01,362 And it's going to give you a number,\n 9277 08:59:01,362 --> 08:59:05,141 If you want to store that\naddress in a variable 9278 08:59:05,141 --> 08:59:11,202 even though yes, it's a number like\n 9279 08:59:11,202 --> 08:59:17,082 that you want to store not an int\n 9280 08:59:17,081 --> 08:59:20,711 And the syntax for doing that--\nsomewhat nonobviously-- is 9281 08:59:20,711 --> 08:59:24,432 to use an asterisk here,\na star operator, and you 9282 08:59:24,432 --> 08:59:26,231 say this when creating the variable. 9283 08:59:26,231 --> 08:59:30,731 If you want p to be a pointer, that\n 9284 08:59:32,412 --> 08:59:36,551 And the star just tells the computer,\n 9285 08:59:36,551 --> 08:59:40,002 this is the address of\nsomething that yes, is an int 9286 08:59:40,002 --> 08:59:41,761 but we're just being more precise. 9287 08:59:41,761 --> 08:59:44,662 So on the right hand side you\nhave the address of operator. 9288 08:59:44,662 --> 08:59:47,641 As always with the equal sign,\nyou copy from right to left. 9289 08:59:47,641 --> 08:59:51,592 Because &n is by definition the address\n 9290 08:59:51,592 --> 08:59:57,141 in a pointer, and the way to declare a\n 9291 08:59:57,141 --> 09:00:01,191 whose address you're storing, and then\n 9292 09:00:01,191 --> 09:00:04,701 indeed a pointer and not\njust a regular old int. 9293 09:00:04,702 --> 09:00:06,172 So let's see this in practice. 9294 09:00:06,171 --> 09:00:09,231 Let me go back to my own\nsource code here and let 9295 09:00:09,231 --> 09:00:11,241 me make just a couple of tweaks. 9296 09:00:11,241 --> 09:00:13,581 I'm going to leave n\nalone here but I'm going 9297 09:00:13,581 --> 09:00:18,121 to go ahead and initially just do this. 
9298 09:00:18,121 --> 09:00:22,702 Let me say int star\np equals ampersand n 9299 09:00:22,702 --> 09:00:27,322 and then down here, I'm going to\n 9300 09:00:28,761 --> 09:00:33,531 And then even though yes, it's just\n 9301 09:00:33,531 --> 09:00:37,671 for integers, there's actually a special\n 9302 09:00:37,671 --> 09:00:40,881 pointers or addresses, and that's %p. 9303 09:00:40,882 --> 09:00:44,182 So now let's go ahead and\nrecompile this, make address-- 9304 09:00:44,182 --> 09:00:49,231 so far so good-- ./address,\nEnter, and a little weirdly 9305 09:00:49,231 --> 09:00:53,871 but perhaps understandably now,\n 9306 09:00:53,871 --> 09:00:57,741 at which the variable n happened to\n 9307 09:00:59,241 --> 09:01:01,791 This computer has a lot\nmore memory so technically 9308 09:01:01,792 --> 09:01:07,852 it was stored at 0x7FFCB4578E5C. 9309 09:01:07,851 --> 09:01:10,011 Now that has no special\nsignificance to me. 9310 09:01:10,011 --> 09:01:12,241 It could have ended up\nsomewhere else altogether 9311 09:01:12,241 --> 09:01:15,741 but this is just where, in my\n 9312 09:01:15,741 --> 09:01:18,261 server to which I'm connected\nusing VS Code here-- 9313 09:01:18,261 --> 09:01:20,859 that just happens to\nbe where n ended up. 9314 09:01:20,859 --> 09:01:23,691 And strictly speaking, I don't even\n 9315 09:01:23,691 --> 09:01:26,541 I could get rid of p\nand I could just say 9316 09:01:26,542 --> 09:01:30,262 print not just n, but the address\n 9317 09:01:30,261 --> 09:01:32,721 You don't need to temporarily\nstore it in a variable. 9318 09:01:32,722 --> 09:01:35,702 Let me just do make\naddress again, ./address 9319 09:01:35,702 --> 09:01:38,282 and now I see this address here. 9320 09:01:38,281 --> 09:01:41,826 And notice if I keep running the\n 9321 09:01:41,827 --> 09:01:44,452 There's other stuff presumably\ngoing on inside of the computer. 
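The steps just described, taking the address of n with an ampersand, storing it in a pointer, and printing it with %p, might be sketched as follows. The function print_address is a hypothetical helper introduced only for this example. The lecture itself writes these lines directly inside main.

```c
#include <stdio.h>

// Hypothetical helper: prints the address held in p, then returns it unchanged.
int *print_address(int *p)
{
    // %p expects a pointer to void, hence the cast.
    printf("%p\n", (void *) p);
    return p;
}
```

Given int n = 50; and int *p = &n; in main, print_address(p) and print_address(&n) print the same address, since p is only a temporary place to keep &n. The exact value depends on the machine and, as noted, may differ from run to run.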
9322 09:01:44,452 --> 09:01:47,862 Maybe it's actually randomizing it so\n 9323 09:01:47,862 --> 09:01:50,362 That can actually be a security\nfeature underneath the hood 9324 09:01:50,362 --> 09:01:55,882 but this happens to be at that moment\n 9325 09:01:55,882 --> 09:01:58,852 quite like our picture a moment ago. 9326 09:01:58,851 --> 09:02:02,002 All right, so let me pause\nhere to see if there's now 9327 09:02:02,002 --> 09:02:03,531 any questions on what we just did. 9328 09:02:05,531 --> 09:02:07,752 AUDIENCE: Is there any\nway to control where 9329 09:02:07,752 --> 09:02:10,912 you are storing something in memory? 9330 09:02:10,912 --> 09:02:14,106 Does it even matter if\nit works, or does it just 9331 09:02:14,106 --> 09:02:16,632 matter that you could go in\nand locate where something is? 9332 09:02:16,632 --> 09:02:18,174 DAVID J. MALAN: Really good question. 9333 09:02:18,173 --> 09:02:20,741 Is there any way to control\nwhere something is in memory? 9334 09:02:20,741 --> 09:02:23,698 Short answer is yes, and this is\n 9335 09:02:23,699 --> 09:02:26,531 and we're going to do this today\n 9336 09:02:26,531 --> 09:02:31,601 because with this power of going to or\n 9337 09:02:31,601 --> 09:02:33,701 I could just arbitrarily\nright now write code 9338 09:02:33,702 --> 09:02:37,972 that stores a value at byte 2 billion,\n 9339 09:02:37,972 --> 09:02:42,132 But that also means potentially,\nI could start creepily looking 9340 09:02:42,132 --> 09:02:46,192 around at all of the computer's memory,\n 9341 09:02:46,191 --> 09:02:48,731 Maybe other programs, maybe\nother parts of programs 9342 09:02:48,731 --> 09:02:50,981 and indeed, this is a\npotential security threat 9343 09:02:50,981 --> 09:02:53,345 if suddenly you're able\nto just look anywhere 9344 09:02:53,345 --> 09:02:54,762 you want in the computer's memory. 
9345 09:02:54,761 --> 09:02:59,381 Now, I'm overselling it a little bit\n 9346 09:02:59,382 --> 09:03:01,932 there are some defenses\nin place in compilers 9347 09:03:01,932 --> 09:03:05,301 and in our operating systems that\n 9348 09:03:05,301 --> 09:03:07,752 But this is still a very\nfrequent source of problems 9349 09:03:07,752 --> 09:03:10,152 and later today we'll\ntalk briefly about things 9350 09:03:10,151 --> 09:03:13,011 called stack overflow,\nwhich is not just a website 9351 09:03:13,011 --> 09:03:15,191 it is a problem that you can encounter. 9352 09:03:15,191 --> 09:03:17,711 Heap overflow, and more\ngenerally buffer overflows-- 9353 09:03:17,711 --> 09:03:21,162 there's just so many things that can\n 9354 09:03:21,162 --> 09:03:24,761 and if any of you have encountered\na segmentation fault yet? 9355 09:03:24,761 --> 09:03:26,681 I think we saw a few\nhands for that already. 9356 09:03:26,682 --> 09:03:29,262 You touched memory\nthat you shouldn't have 9357 09:03:29,261 --> 09:03:33,971 and odds are you did it most recently\n 9358 09:03:33,972 --> 09:03:37,362 Going to the left, or negative in an\n 9359 09:03:38,202 --> 09:03:42,412 And we'll explain today why it\nis you were able to do that. 9360 09:03:42,412 --> 09:03:44,891 Other questions on\nthese primitives so far? 9361 09:03:46,984 --> 09:03:50,109 AUDIENCE: [INAUDIBLE] pointer star p,\n 9362 09:03:51,391 --> 09:03:52,641 DAVID J. MALAN: Good question. 9363 09:03:53,932 --> 09:03:56,422 Let me rewind in time to the\nprevious version of this code 9364 09:03:56,421 --> 09:03:58,701 where I actually had\na variable called p. 9365 09:03:58,702 --> 09:04:02,512 Just like with variable\ndeclarations in the past 9366 09:04:02,511 --> 09:04:07,981 once you've declared a variable to\n 9367 09:04:07,981 --> 09:04:11,121 star, a.k.a. a pointer,\nyou don't thereafter 9368 09:04:11,121 --> 09:04:14,031 keep using the word\nint or now, the star. 
9369 09:04:14,031 --> 09:04:15,831 Once you've declared it, that's it. 9370 09:04:15,831 --> 09:04:17,281 You only refer to it by name. 9371 09:04:17,281 --> 09:04:21,471 And so it's very\ndeliberate what I did here 9372 09:04:21,472 --> 09:04:24,022 saying that the type here is int star-- 9373 09:04:24,022 --> 09:04:26,031 that is a pointer to an int-- 9374 09:04:26,031 --> 09:04:28,971 but here I just said the name\nof the variable, as always. 9375 09:04:28,972 --> 09:04:31,672 I didn't repeat int, and\nI also didn't repeat star. 9376 09:04:31,671 --> 09:04:34,551 But at the risk of bending\none's minds a little bit there 9377 09:04:34,551 --> 09:04:40,801 is unfortunately one other use for the\n 9378 09:04:40,801 --> 09:04:44,542 If you want to print out not\nthe address of something 9379 09:04:44,542 --> 09:04:49,622 but what is at a specific\naddress, you can actually do this. 9380 09:04:49,621 --> 09:04:54,981 If I want to print out the integer\n 9381 09:04:54,981 --> 09:04:59,421 I can actually use the star here, which\n 9382 09:04:59,421 --> 09:05:02,521 said but it has a different\nfunction here-- a different purpose. 9383 09:05:02,522 --> 09:05:04,922 So let me go ahead and do\nthis in two different ways. 9384 09:05:04,921 --> 09:05:06,726 I'm going to leave this\nline of code as is 9385 09:05:06,726 --> 09:05:08,601 but I'm going to add\nanother line of code now 9386 09:05:08,601 --> 09:05:12,561 that prints out what apparently\nwill be an integer, in a moment. 9387 09:05:12,562 --> 09:05:16,485 So %i backslash n, and I could\n 9388 09:05:16,485 --> 09:05:18,652 So there's really nothing\nspecial happening now, I'm 9389 09:05:18,651 --> 09:05:20,661 just adding a sort of\nmindless printing of n. 9390 09:05:20,662 --> 09:05:23,402 So make address, ./address-- 9391 09:05:23,401 --> 09:05:26,961 there's the current address of\nn and there's the value of n. 
9392 09:05:26,961 --> 09:05:29,932 But what's kind of\ncool about C here, too 9393 09:05:29,932 --> 09:05:34,222 is if you know that a value is\nat a specific address like p 9394 09:05:34,222 --> 09:05:37,952 there's one other use for this\nstar operator, the asterisk. 9395 09:05:37,952 --> 09:05:41,582 You can use it as the\nso-called dereference operator 9396 09:05:41,581 --> 09:05:44,431 which means go to that address. 9397 09:05:44,432 --> 09:05:50,062 And so here what we actually have\nis an example of a pointer p 9398 09:05:50,062 --> 09:05:54,992 which is an address like\n0x123 or 0x7FF and so forth. 9399 09:05:54,991 --> 09:05:58,551 But if you say star p now, you're\nnot redeclaring the variable 9400 09:05:58,551 --> 09:05:59,991 because I didn't mention int-- 9401 09:05:59,991 --> 09:06:02,751 you're going to that address in p. 9402 09:06:02,752 --> 09:06:04,432 So let me recompile this now. 9403 09:06:04,432 --> 09:06:10,552 Make address, ./address,\nand just to be clear-- 9404 09:06:12,082 --> 09:06:15,592 I'm first going to see the\npointer itself, 0x something. 9405 09:06:15,591 --> 09:06:18,456 What's the second line of output\nI should presumably see now? 9406 09:06:22,951 --> 09:06:27,271 So I'm hearing 50, and that's true\n 9407 09:06:27,271 --> 09:06:33,511 of n and print it in line seven, but\n 9408 09:06:33,512 --> 09:06:36,692 that's indeed going to just\nshow you the number n-- 9409 09:06:39,482 --> 09:06:42,389 All right, any questions now on\n 9410 09:06:42,389 --> 09:06:44,222 I think this is confusing--\nthe fact that we 9411 09:06:44,222 --> 09:06:46,412 use the star for\nmultiplication, the fact 9412 09:06:46,411 --> 09:06:48,721 that we use the star\nto declare a pointer 9413 09:06:48,722 --> 09:06:51,961 but then we use a star in a third\nway to dereference the pointer 9414 09:06:53,012 --> 09:06:56,612 It's just too confusing, honestly,\n 9415 09:07:07,862 --> 09:07:09,112 DAVID J. MALAN: Good question. 
9416 09:07:09,112 --> 09:07:12,682 Do you-- when you are using\nthe ampersand operator 9417 09:07:12,682 --> 09:07:14,631 to get the address of\nsomething, the onus 9418 09:07:14,631 --> 09:07:18,771 is on you at the moment to know\n 9419 09:07:22,042 --> 09:07:25,402 I wrote this code so I\nknow in line six that I'm 9420 09:07:25,402 --> 09:07:28,491 trying to get the address\nof what is an integer. 9421 09:07:28,491 --> 09:07:30,631 AUDIENCE: What about line eight? 9422 09:07:30,631 --> 09:07:34,351 DAVID J. MALAN: In line\neight you don't have 9423 09:07:34,351 --> 09:07:36,182 to worry about that-- good question. 9424 09:07:36,182 --> 09:07:40,211 Notice in line eight, I didn't tell\n 9425 09:07:40,211 --> 09:07:44,911 what kind of address I'm going\n 9426 09:07:44,911 --> 09:07:47,941 I told the compiler\nthat p, now and forever 9427 09:07:47,942 --> 09:07:50,402 is going to be the address of an int. 9428 09:07:50,402 --> 09:07:55,322 That's enough information in advance so\n 9429 09:07:55,322 --> 09:07:59,311 still knows on line eight\nthat p is a pointer to an int 9430 09:07:59,311 --> 09:08:02,731 and that way it will print out\nall four bytes at that address 9431 09:08:02,732 --> 09:08:06,649 not just part of it, and not\nmore than those four bytes. 9432 09:08:09,161 --> 09:08:10,661 AUDIENCE: Do pointers have pointers? 9433 09:08:10,661 --> 09:08:11,961 DAVID J. MALAN: Do\npointers have pointers? 9434 09:08:12,461 --> 09:08:16,091 We won't do this today by\nhaving pointers to pointers 9435 09:08:16,091 --> 09:08:19,781 but yes, you can use star\nstar, and then things get-- 9436 09:08:21,671 --> 09:08:23,862 We won't do that today and\nwe won't do that often. 9437 09:08:23,862 --> 09:08:26,412 In fact Python, another language,\nis just a couple of weeks 9438 09:08:31,692 --> 09:08:33,552 That was-- more verbal\nfeedback like that 9439 09:08:33,552 --> 09:08:36,232 is helpful as we forge into\nthe more complicated stuff. 
9440 09:08:38,269 --> 09:08:40,145 AUDIENCE: What's the\npoint of [INAUDIBLE]?? 9441 09:08:43,432 --> 09:08:46,521 DAVID J. MALAN: What's the\npoint of printing the address? 9442 09:08:46,521 --> 09:08:49,811 AUDIENCE: Like, using the\naddress to [INAUDIBLE].. 9443 09:08:50,741 --> 09:08:51,881 What's the point of doing this? 9444 09:08:51,881 --> 09:08:54,131 If you don't mind, let me--\nlet's get there in a moment. 9445 09:08:54,131 --> 09:08:56,831 This is not the common use case,\njust printing out the address-- 9446 09:08:58,182 --> 09:09:00,762 At the moment we care only\nfor the sake of discussion. 9447 09:09:00,762 --> 09:09:02,814 We're soon going to start\nusing these addresses. 9448 09:09:02,813 --> 09:09:05,021 So hang in there just a\nlittle bit for that one, too 9449 09:09:05,021 --> 09:09:08,981 but it will solve some\nproblems for us before long. 9450 09:09:08,982 --> 09:09:12,672 So let's actually just now depict what\n 9451 09:09:15,052 --> 09:09:19,332 So if I toggle back here, let\nme redraw my computer's memory 9452 09:09:19,332 --> 09:09:22,781 now let me plop into the memory n,\n 9453 09:09:23,832 --> 09:09:25,992 Where is p in my computer's memory? 9454 09:09:25,991 --> 09:09:29,051 Specifically, I don't know and\n 9455 09:09:29,052 --> 09:09:31,102 run the program so for\nthe sake of discussion 9456 09:09:31,101 --> 09:09:36,072 let's just propose that if 50 ended\n 9457 09:09:36,072 --> 09:09:38,832 p ends up over here, at address-- 9458 09:09:38,832 --> 09:09:42,022 whoops-- at whatever\naddress this is here. 9459 09:09:42,021 --> 09:09:44,471 But notice a couple of curiosities now. 9460 09:09:44,472 --> 09:09:47,982 If p is a pointer, it's\nthe address of something. 
9461 09:09:47,982 --> 09:09:53,322 So the value in p should be an address,\n 9462 09:09:53,322 --> 09:09:57,432 0x123, and technically there's not\n 9463 09:09:57,432 --> 09:09:59,832 there's not even a 123\nthere per se-- there's 9464 09:09:59,832 --> 09:10:03,372 a pattern of bits that\nrepresents the address 0x123. 9465 09:10:03,372 --> 09:10:07,042 But again, that's week zero--\n 9466 09:10:07,042 --> 09:10:13,122 So if this is p, and this I claimed\n 9467 09:10:13,122 --> 09:10:15,592 Can someone conjecture here? 9468 09:10:15,591 --> 09:10:20,421 Because it turns out whether n\nis an int or a char or a bool 9469 09:10:20,421 --> 09:10:23,061 which are different\ntypes-- heck, even a long-- 9470 09:10:23,061 --> 09:10:27,231 it turns out that p is always going\n 9471 09:10:36,868 --> 09:10:40,812 AUDIENCE: Perhaps it\nallocates eight bytes 9472 09:10:40,811 --> 09:10:44,319 but it doesn't know the type\nof the data [INAUDIBLE]. 9473 09:10:45,362 --> 09:10:47,552 Maybe it's allocating eight bytes\n 9474 09:10:47,552 --> 09:10:50,072 Turns out that's OK because\nan address is an address. 9475 09:10:50,072 --> 09:10:53,641 It's really up to the programmer to\n 9476 09:10:55,741 --> 09:11:00,803 AUDIENCE: Maybe the first four for\n 9477 09:11:00,803 --> 09:11:06,393 is some null that [INAUDIBLE]\nwhere the pointer ends. 9478 09:11:06,394 --> 09:11:07,601 DAVID J. MALAN: OK, possibly. 9479 09:11:07,601 --> 09:11:10,572 It could be that pointers have\n 9480 09:11:10,572 --> 09:11:13,451 or something curious like that,\n 9481 09:11:13,451 --> 09:11:15,112 Turns out that's not the case. 9482 09:11:15,112 --> 09:11:18,641 It turns out that pointers\nnowadays typically are, but not 9483 09:11:18,641 --> 09:11:21,281 always are eight bytes, a.k.a. 
9484 09:11:21,281 --> 09:11:24,461 64 bits, because you and\nI-- our Macs, our PCs 9485 09:11:24,461 --> 09:11:28,271 heck-- even our phones have a lot\n 9486 09:11:28,271 --> 09:11:30,161 Back in the day, a\npointer might have only 9487 09:11:30,161 --> 09:11:34,061 been 32 bits, or even only\neight bits way back in the day. 9488 09:11:34,061 --> 09:11:36,911 It's considered 32 bits, because\n 9489 09:11:36,911 --> 09:11:40,451 How high can you count,\nroughly, if you've got 32 bits? 9490 09:11:40,451 --> 09:11:43,261 What's the number we keep rattling off? 9491 09:11:43,262 --> 09:11:48,422 32 bits is roughly 2 to\nthe 32, so it's 4 billion 9492 09:11:48,421 --> 09:11:52,631 and I keep saying it's 2 billion if you\n 9493 09:11:52,631 --> 09:11:55,891 there's a reason I keep saying\n2 billion bytes, two gigabytes 9494 09:11:55,891 --> 09:11:58,951 because for a very long time that\n 9495 09:12:00,482 --> 09:12:02,852 Because the pointers that\nthe computers were using 9496 09:12:02,851 --> 09:12:04,891 were only, for instance, 32 bits. 9497 09:12:04,891 --> 09:12:07,951 And with 32 bits, depending on whether\n 9498 09:12:07,951 --> 09:12:10,981 you can count as high as 2 billion,\nroughly, or maybe 4 billion 9499 09:12:10,982 --> 09:12:13,322 but you know what-- your\nMac, your PC, your phone 9500 09:12:13,322 --> 09:12:17,802 could not have had five gigabytes of\n 9501 09:12:17,802 --> 09:12:20,552 You certainly couldn't have had\n 9502 09:12:20,552 --> 09:12:22,532 which might be 8 gigabytes of memory-- 9503 09:12:24,572 --> 09:12:28,862 Because with 4 bytes, or 32\nbits, you literally, physically 9504 09:12:28,862 --> 09:12:32,972 can't count that high, which means if I\n 9505 09:12:32,972 --> 09:12:36,662 would run out of numbers to describe\n 9506 09:12:37,991 --> 09:12:41,131 So pointers nowadays are\n64 bits, or eight bytes. 
9507 09:12:41,881 --> 09:12:43,798 I can't even pronounce\nhow big that number is 9508 09:12:43,798 --> 09:12:46,411 but it's plenty for the\nnext many years, and so 9509 09:12:46,411 --> 09:12:48,241 we've drawn it that\nway on the board here. 9510 09:12:48,241 --> 09:12:49,862 Now let's just abstract this away. 9511 09:12:49,862 --> 09:12:51,569 Let's get rid of all\nthe other bytes that 9512 09:12:51,569 --> 09:12:54,271 are storing something or\nnothing else, and let's now 9513 09:12:54,271 --> 09:12:57,601 start to abstract away this\ncomplexity because the reality is 9514 09:12:59,491 --> 09:13:01,801 what is this useful for, or\nwhat do we-- do we actually 9515 09:13:04,322 --> 09:13:06,421 We're doing this so that\nyou see there's no magic. 9516 09:13:06,421 --> 09:13:09,311 We're just moving things around\nand poking around in memory. 9517 09:13:09,311 --> 09:13:12,151 But what a person would typically\ndo when talking about pointers 9518 09:13:12,152 --> 09:13:14,762 would literally be to\njust point at something. 9519 09:13:14,762 --> 09:13:17,312 I really don't care\nwhat address n is at 9520 09:13:17,311 --> 09:13:20,491 so it suffices in general, when\n 9521 09:13:20,491 --> 09:13:22,381 having a discussion\nwith another programmer 9522 09:13:22,381 --> 09:13:26,701 you just draw an arrow from the\n 9523 09:13:26,701 --> 09:13:31,830 because neither you nor I probably care\n 9524 09:13:31,830 --> 09:13:35,173 There's your pointer-- it's literally\n 9525 09:13:35,173 --> 09:13:37,381 So it turns out that these\npointers, these addresses 9526 09:13:37,381 --> 09:13:41,191 are not that dissimilar to what\nwe've done for hundreds of years 9527 09:13:41,192 --> 09:13:43,742 in the form of a postal system. 9528 09:13:43,741 --> 09:13:45,481 For instance, here is a post office-- 9529 09:13:45,482 --> 09:13:48,092 here, no-- here is a\nmailbox, and suppose 9530 09:13:48,091 --> 09:13:50,791 that this is a mailbox labeled p.
9531 09:13:50,792 --> 09:13:53,552 It's a pointer, and suppose\nthere's another mailbox 9532 09:13:53,552 --> 09:13:57,402 way over there, which is just\n 9533 09:13:57,402 --> 09:13:59,192 What are we really talking about? 9534 09:13:59,192 --> 09:14:03,242 Well, you store in a computer's\n 9535 09:14:03,241 --> 09:14:07,201 or the word "hi" inside of your\n 9536 09:14:07,201 --> 09:14:11,281 But today we can also use\nthose same memory locations 9537 09:14:11,281 --> 09:14:12,911 to store the address of things. 9538 09:14:12,911 --> 09:14:16,711 For instance, if I\nopen this up here and I 9539 09:14:16,711 --> 09:14:20,432 see OK, the value inside of this\n 9540 09:14:21,722 --> 09:14:26,222 0x123-- that's like a\npointer, a breadcrumb leading 9541 09:14:26,222 --> 09:14:28,022 from one location in memory to another. 9542 09:14:28,021 --> 09:14:30,521 And in fact, would someone who's\nseated roughly over there-- 9543 09:14:30,521 --> 09:14:33,121 do you mind getting the mail over there? 9544 09:14:33,122 --> 09:14:35,942 Any volunteers over in this section? 9545 09:14:35,942 --> 09:14:38,292 Just need you to get to\nthe mailbox before I do. 9546 09:14:40,832 --> 09:14:46,287 Whoever is gesturing most\nwildly, come on down. 
9547 09:14:58,561 --> 09:15:01,441 OK, come on up to the edge of the\n 9548 09:15:01,442 --> 09:15:05,162 if this is p, that is\napparently n, but to make clear 9549 09:15:05,161 --> 09:15:07,981 what we're talking about when\nwe're storing 0x whatever values-- 9550 09:15:07,982 --> 09:15:11,132 like 0x123, that's\nessentially equivalent to my 9551 09:15:11,131 --> 09:15:13,862 maybe pulling out something\nlike this and just 9552 09:15:13,862 --> 09:15:16,412 abstractly pointing\nto your mailbox there 9553 09:15:16,411 --> 09:15:20,671 or if you prefer,\npointing to the mailbox-- 9554 09:15:28,021 --> 09:15:30,182 This is akin to me\npointing at your mailbox 9555 09:15:30,182 --> 09:15:32,224 and if you want to go\nahead and open your mailbox 9556 09:15:32,224 --> 09:15:38,562 and reveal to the crowd what's\ninside your mailbox labeled n. 9557 09:15:43,961 --> 09:15:46,582 We have a little CS50 stress\nball for your trouble. 9558 09:15:47,913 --> 09:15:50,621 So that's just to put a visual on\n 9559 09:15:50,622 --> 09:15:53,532 because it can get very abstract,\n 9560 09:15:53,531 --> 09:15:56,752 talking about addresses and memory and\n 9561 09:15:56,752 --> 09:15:59,669 But if you think about just walking\n 9562 09:15:59,669 --> 09:16:02,622 complex that's got a lot of\nmailboxes, those mailboxes 9563 09:16:02,622 --> 09:16:05,592 essentially are a big\nchunk of memory and each 9564 09:16:05,591 --> 09:16:07,451 of those mailboxes has an address-- 9565 09:16:07,451 --> 09:16:10,182 this is apartment one, two,\nthree-- apartment 2 billion. 9566 09:16:10,182 --> 09:16:13,451 And inside of those\nmailboxes can go anything 9567 09:16:13,451 --> 09:16:15,621 that can be represented as information. 9568 09:16:15,622 --> 09:16:18,702 It could be a number\nlike n, or 50, or if you 9569 09:16:18,701 --> 09:16:21,101 prefer it could be a\nnumber that represents 9570 09:16:21,101 --> 09:16:22,991 the address of another mailbox. 
9571 09:16:22,991 --> 09:16:26,171 And this is akin, really, if\nyou've ever had an apartment or you 9572 09:16:26,171 --> 09:16:28,991 and your parents have moved,\nto having a forwarding address. 9573 09:16:28,991 --> 09:16:31,362 It's like having the\nPost Office in the US 9574 09:16:31,362 --> 09:16:34,842 put some kind of piece of paper\nin your old mailbox saying 9575 09:16:34,841 --> 09:16:37,271 actually forward it\nto that other mailbox. 9576 09:16:37,271 --> 09:16:39,641 That really is all a pointer is doing. 9577 09:16:39,641 --> 09:16:41,351 At the end of the day,\nit's just a number 9578 09:16:41,351 --> 09:16:43,691 but it's a number being\nused in a different way 9579 09:16:43,692 --> 09:16:45,822 and it's the syntax\nthat we've introduced 9580 09:16:45,822 --> 09:16:49,631 not just int but int star,\nthat tells the computer how 9581 09:16:49,631 --> 09:16:54,101 to treat that number in\nthis slightly different way. 9582 09:16:54,101 --> 09:16:57,201 Are there any questions then, on this? 9583 09:16:59,322 --> 09:17:01,739 AUDIENCE: If you had a variable,\nlike int c, [INAUDIBLE].. 9584 09:17:06,072 --> 09:17:08,052 DAVID J. MALAN: If I did int c and-- 9585 09:17:12,372 --> 09:17:14,502 Equal to n, so let me\nactually type it out. 9586 09:17:14,502 --> 09:17:16,631 If I give myself another\nline of code, tell me 9587 09:17:16,631 --> 09:17:22,612 one last time what to type.\nint is equal to n, like this? 9588 09:17:22,612 --> 09:17:27,311 So this is OK, and I can't draw it\n 9589 09:17:27,311 --> 09:17:31,542 but this would be like creating another\n 9590 09:17:31,542 --> 09:17:35,592 down here, that stores\nan identical copy of 50 9591 09:17:35,591 --> 09:17:38,741 because the assignment operator\n 9592 09:17:39,561 --> 09:17:43,031 So that would just add one\nmore rectangle of size four 9593 09:17:45,752 --> 09:17:47,732 If I'm answering your\nquestion as intended. 
9594 09:17:47,732 --> 09:17:52,592 OK, so that is week one style use of\n 9595 09:17:52,591 --> 09:17:55,411 I could, though, start copying\npointers but again, we'll 9596 09:17:55,411 --> 09:17:57,241 come back to some of that complexity. 9597 09:17:58,781 --> 09:18:00,281 AUDIENCE: That was a great question. 9598 09:18:02,201 --> 09:18:05,444 does the same pointer point\nto the new replica as well? 9599 09:18:05,444 --> 09:18:06,862 DAVID J. MALAN: Ah, good question. 9600 09:18:07,766 --> 09:18:12,461 And to repeat for the camera, if I\n 9601 09:18:12,461 --> 09:18:16,631 int c equals n, and I claim without\n 9602 09:18:16,631 --> 09:18:20,551 that this gives me another rectangle,\n 9603 09:18:22,042 --> 09:18:24,402 And this is what's important\nand really characteristic 9604 09:18:24,402 --> 09:18:28,362 of C. Nothing happens\nautomatically for you. 9605 09:18:28,362 --> 09:18:31,942 p is not going to be updated\nunless you update p in some way 9606 09:18:31,942 --> 09:18:34,482 so creating a third\nvariable called c-- even 9607 09:18:34,482 --> 09:18:36,882 if you're copying its\nvalue from right to left 9608 09:18:36,881 --> 09:18:40,061 that has no effect on\nanything else in the program. 9609 09:18:41,391 --> 09:18:47,561 So what have we seen that's perhaps\n 9610 09:18:47,561 --> 09:18:51,581 Well, recall that we talked quite a\n 9611 09:18:51,582 --> 09:18:57,461 to recap in layperson's terms, what is\n 9612 09:18:57,461 --> 09:18:59,551 So say-- well, let me\ntake a specific hand here. 9613 09:19:02,286 --> 09:19:03,661 AUDIENCE: An array of characters. 9614 09:19:06,332 --> 09:19:09,122 An array of characters, and we-- 9615 09:19:09,122 --> 09:19:12,242 I claimed-- or revealed last week\nthat string is not technically 9616 09:19:12,241 --> 09:19:15,511 a feature built into C. 
It's\nnot an official data type 9617 09:19:15,512 --> 09:19:17,762 but every programmer\nin most any language 9618 09:19:17,762 --> 09:19:21,002 refers to sequences of\ncharacters-- words, letters 9619 09:19:22,811 --> 09:19:26,131 So the vernacular exists but\nthe data type doesn't typically 9620 09:19:26,131 --> 09:19:29,471 exist per se in C. So what\nwe're about to do, if you will 9621 09:19:29,472 --> 09:19:32,311 for dramatic effect, is take\noff some training wheels today. 9622 09:19:32,311 --> 09:19:36,811 The CS50 library implemented in the\n 9623 09:19:36,811 --> 09:19:38,941 we claim has had a\nbunch of things in it. 9624 09:19:38,942 --> 09:19:42,122 Prototypes for GetString,\nprototypes for GetInt 9625 09:19:42,122 --> 09:19:44,641 and all of those other\nfunctions, but it turns out 9626 09:19:44,641 --> 09:19:48,841 it also is what defines the\nword "string" in such a way 9627 09:19:48,841 --> 09:19:51,341 that you all can use it\nthese past several weeks. 9628 09:19:51,341 --> 09:19:54,002 So let's take a look at an\nexample of a string in use. 9629 09:19:54,002 --> 09:19:56,042 Here, for instance,\nis a tiny bit of code 9630 09:19:56,042 --> 09:20:00,781 that uses the word "string,"\ncreating a variable called s 9631 09:20:00,781 --> 09:20:03,443 and then storing quote\nunquote, hi, exclamation point. 9632 09:20:03,444 --> 09:20:06,152 Let's consider what this looks\n 9633 09:20:06,152 --> 09:20:08,902 I don't care about all the other\n 9634 09:20:08,902 --> 09:20:11,912 and this per last week is\nhow "hi" might be stored. 
9635 09:20:11,911 --> 09:20:14,671 h-i exclamation point and then\none more, as someone already 9636 09:20:14,671 --> 09:20:18,511 observed, that sentinel value--\nthat null character which 9637 09:20:18,512 --> 09:20:21,919 just means eight zero bits to\ndemarcate the end of that string 9638 09:20:21,919 --> 09:20:24,002 just in case there's\nsomething to the right of it 9639 09:20:24,002 --> 09:20:27,162 the computer can now distinguish\none string from another. 9640 09:20:27,161 --> 09:20:30,364 So last week we introduced\nthis new syntax. 9641 09:20:30,364 --> 09:20:32,281 Well, if strings are\njust arrays of characters 9642 09:20:32,281 --> 09:20:35,192 you can then very cleverly use\nthat square bracket notation 9643 09:20:35,192 --> 09:20:39,992 and go to location zero or one\nor two, which are like addresses 9644 09:20:39,991 --> 09:20:41,792 but they're relative to the string. 9645 09:20:41,792 --> 09:20:46,741 This could be at 0x123 or 0x456,\nbut with this bracket notation 9646 09:20:46,741 --> 09:20:49,741 zero is always the beginning\nof the string, one is the next 9647 09:20:49,741 --> 09:20:51,161 two is the next, and so forth. 9648 09:20:51,161 --> 09:20:55,921 So that was our array syntax\nfor indexing into an array. 9649 09:20:55,921 --> 09:20:58,832 But technically speaking, we\ncan go a little deeper today-- 9650 09:20:58,832 --> 09:21:05,101 technically speaking, if hi is\n 9651 09:21:05,101 --> 09:21:11,072 it stands to reason that i is at\n 9652 09:21:14,072 --> 09:21:18,692 Now, I don't care about 123 per se,\n 9653 09:21:19,951 --> 09:21:23,461 Even in hex, if you just add\none when you start at 0x123 9654 09:21:23,461 --> 09:21:25,817 the next number is four,\nfive, six at the end. 9655 09:21:25,817 --> 09:21:27,692 I don't have to worry\nabout A's, B's, and C's 9656 09:21:27,692 --> 09:21:30,702 because I'm not counting\nthat high in this example. 
9657 09:21:30,701 --> 09:21:34,891 So if that's the case, and\nmy computer is actually 9658 09:21:34,891 --> 09:21:42,631 laying out the word hi in memory\n 9659 09:21:42,631 --> 09:21:45,362 What exactly is s if,\nat the end of the day 9660 09:21:45,362 --> 09:21:51,391 H-I exclamation point null is storing--\n 9661 09:21:52,366 --> 09:21:54,241 Now that I've taken off\nthose training wheels 9662 09:21:54,241 --> 09:21:57,841 and showed you where H-I\nexclamation point null actually are 9663 09:21:59,582 --> 09:22:03,572 Well s, as always, is\nactually a variable. 9664 09:22:03,572 --> 09:22:05,612 Even in the code I\nproposed a moment ago 9665 09:22:05,612 --> 09:22:08,912 s is apparently a data type\nthat yes, doesn't come with C 9666 09:22:08,911 --> 09:22:11,461 but CS50's library makes it exist. 9667 09:22:11,461 --> 09:22:16,832 s is a variable of type string,\nso where is s in this picture? 9668 09:22:16,832 --> 09:22:20,792 Well, it turns out that\ns might be up here. 9669 09:22:20,792 --> 09:22:24,332 Again, I'm just drawing it anywhere\nfor the sake of discussion 9670 09:22:24,332 --> 09:22:28,502 but s is a variable\nper that line of code. 9671 09:22:28,502 --> 09:22:32,338 What s is storing,\napparently, I claim, is 0x123. 9672 09:22:32,338 --> 09:22:35,671 I actually don't really care about these\n 9673 09:22:35,671 --> 09:22:40,951 s is apparently, as of now, today,\n 9674 09:22:42,122 --> 09:22:44,672 Specifically, the first character in s. 9675 09:22:44,671 --> 09:22:46,771 And this is the last\npiece of the puzzle. 9676 09:22:46,771 --> 09:22:50,341 Last week we had this clever way\n 9677 09:22:50,341 --> 09:22:55,261 Well, it turns out that strings are\n 9678 09:22:55,262 --> 09:22:59,222 as a variable that is a\npointer, inside of which 9679 09:22:59,222 --> 09:23:02,262 is the address of the first\ncharacter in the string. 
9680 09:23:02,262 --> 09:23:05,312 So if s points at the\nfirst character and you 9681 09:23:05,311 --> 09:23:07,862 can trust that backslash zero\nis at the end of the string 9682 09:23:07,862 --> 09:23:13,451 that's literally all you need to figure\n 9683 09:23:14,891 --> 09:23:16,502 Well, let's be a little more concrete. 9684 09:23:16,502 --> 09:23:20,162 In terms of this picture, if I've\n 9685 09:23:20,161 --> 09:23:25,322 it turns out all this time since\n 9686 09:23:25,322 --> 09:23:32,232 semi-secretly been an\nalias for char star. 9687 09:23:34,752 --> 09:23:36,201 So why does this make sense? 9688 09:23:36,201 --> 09:23:39,441 It's a little weird still,\nbut if in our previous example 9689 09:23:39,442 --> 09:23:43,031 we were able to store the address of\n 9690 09:23:45,192 --> 09:23:48,042 well, if as of now strings\nare just the address 9691 09:23:48,042 --> 09:23:53,472 of the first character in a string, then\n 9692 09:23:53,472 --> 09:23:57,222 because that means s is the\naddress of a character, the very 9693 09:23:57,222 --> 09:23:58,822 first character in the string. 9694 09:23:58,822 --> 09:24:02,802 Now, the string might have three letters\n 9695 09:24:02,802 --> 09:24:04,932 if it's a long paragraph,\nbut that's fine 9696 09:24:04,932 --> 09:24:06,849 because you can trust\nthat there's going to be 9697 09:24:06,849 --> 09:24:08,542 that null character at the very end. 9698 09:24:08,542 --> 09:24:12,281 So this is a general purpose\nway of representing strings 9699 09:24:12,281 --> 09:24:15,402 using this new mechanism in C. 9700 09:24:15,402 --> 09:24:18,582 So in fact, let me go ahead\nhere and introduce maybe 9701 09:24:18,582 --> 09:24:20,421 a couple of manipulations of this. 9702 09:24:20,421 --> 09:24:24,191 Let me go back to my code here, and\n 9703 09:24:24,192 --> 09:24:27,742 and let's instead now\ndo, for instance, this. 9704 09:24:27,741 --> 09:24:32,743 Let me add in the CS50 library,\nso we'll include CS50.H for now. 
9705 09:24:32,743 --> 09:24:34,451 I'm going to go ahead\nand inside of main 9706 09:24:34,451 --> 09:24:37,331 give myself a string s\nequals hi exclamation point. 9707 09:24:37,332 --> 09:24:38,982 I don't type the backslash zero. 9708 09:24:38,982 --> 09:24:43,589 C does that for me automatically by\n 9709 09:24:43,588 --> 09:24:45,171 Now let me just go ahead and print it. 9710 09:24:45,171 --> 09:24:48,341 So this again is week 1 style stuff\n 9711 09:24:49,972 --> 09:24:55,122 So let me do make address, Enter,\n 9712 09:24:56,752 --> 09:25:00,701 But let's start to peel back\nsome of these layers here. 9713 09:25:00,701 --> 09:25:04,721 Let me first of all, get rid of\nthe CS50 library for a moment 9714 09:25:04,722 --> 09:25:09,012 and let me change string to char star. 9715 09:25:09,012 --> 09:25:11,262 And it's a little bit weird\nbut yes, the convention 9716 09:25:11,262 --> 09:25:15,260 is to say char, a space, then the\n 9717 09:25:16,302 --> 09:25:19,052 Strictly speaking though, you might\n 9718 09:25:19,052 --> 09:25:22,032 do it like this or like\nthis, but the canonical way 9719 09:25:22,031 --> 09:25:23,811 is typically to do it like that. 9720 09:25:23,811 --> 09:25:26,671 So now no more CS50 library, no\n 9721 09:25:26,671 --> 09:25:29,182 I'm just treating strings\nfor what they really are. 
9722 09:25:29,182 --> 09:25:32,381 Let me go ahead and do\nmake address, Enter-- 9723 09:25:32,381 --> 09:25:34,542 so far so good-- ./address-- 9724 09:25:36,012 --> 09:25:40,211 So %s is a thing that comes with printf\n 9725 09:25:40,211 --> 09:25:44,262 terminology but strictly speaking\n 9726 09:25:44,262 --> 09:25:48,582 It's always been char star,\nso what this means now is I 9727 09:25:48,582 --> 09:25:52,122 can start to have some fun\nwith these basic ideas 9728 09:25:52,122 --> 09:25:55,252 even though this is not purposeful\n 9729 09:25:55,252 --> 09:25:59,262 But if s is this-- let me go back\n 9730 09:25:59,262 --> 09:26:01,752 Let's put those training wheels\nback on for just a moment 9731 09:26:01,752 --> 09:26:04,582 so that I can do one\nmanipulation at a time. 9732 09:26:04,582 --> 09:26:07,492 Here's my string s, as before. 9733 09:26:07,491 --> 09:26:10,542 Well, let me go ahead and\ndeclare a char called c 9734 09:26:10,542 --> 09:26:15,582 and let me store the first character\n 9735 09:26:15,582 --> 09:26:18,252 s bracket zero, and\nthat should give me h. 9736 09:26:18,252 --> 09:26:21,311 And then just for kicks, let\nme go ahead and do char star-- 9737 09:26:21,311 --> 09:26:28,421 whoops-- let me go ahead and do\nchar star p equals ampersand c 9738 09:26:28,421 --> 09:26:30,851 and see what this\nactually prints for me. 9739 09:26:30,851 --> 09:26:34,222 Let me go ahead and\nprint out what p is here. 9740 09:26:34,222 --> 09:26:35,452 So we're just playing around. 9741 09:26:35,451 --> 09:26:39,041 So make address-- so\nfar so good-- ./address. 9742 09:26:39,042 --> 09:26:41,381 All right, so what have I just done? 9743 09:26:41,381 --> 09:26:46,511 I've just created a char c and\nstored in it the letter H, which 9744 09:26:46,512 --> 09:26:50,891 is the same thing as s bracket zero, then\n 9745 09:26:50,891 --> 09:26:53,752 and that's apparently 0x7FF whatever. 9746 09:26:55,002 --> 09:26:57,201 But I technically\ndidn't have to do that.
9747 09:26:57,201 --> 09:26:59,002 Let me go ahead and do two things now. 9748 09:26:59,002 --> 09:27:07,362 Instead of just printing p, let me go\n 9749 09:27:07,362 --> 09:27:09,822 Let me go ahead and do\nmake address, Enter-- 9750 09:27:09,822 --> 09:27:12,972 so far so good-- ./address and-- 9751 09:27:12,972 --> 09:27:15,732 damn it, what did I do wrong. 9752 09:27:15,732 --> 09:27:17,562 Oh shoot, I didn't want to do that. 9753 09:27:17,561 --> 09:27:21,141 Oh, I really made a mess of this. 9754 09:27:23,921 --> 09:27:27,191 That was supposed to be impressive\nbut it was the opposite. 9755 09:27:30,682 --> 09:27:34,542 So if I intended to do this,\nwhy are lines nine and 10 9756 09:27:36,822 --> 09:27:40,002 Didn't really intend to go here,\nbut let me try to save this. 9757 09:27:40,002 --> 09:27:47,351 Why are we seeing different addresses,\n 9758 09:27:55,482 --> 09:27:57,932 AUDIENCE: [INAUDIBLE]\nis the character c is 9759 09:27:57,932 --> 09:28:02,832 its own sort of location\nof the [INAUDIBLE] 9760 09:28:02,832 --> 09:28:04,874 and it's taking off just\nthe values [INAUDIBLE].. 9761 09:28:05,874 --> 09:28:08,045 So if I really wanted to\nweasel my way out of this 9762 09:28:08,044 --> 09:28:10,711 this is a great answer to the\nprevious question which was about 9763 09:28:10,711 --> 09:28:15,451 what if I introduce another variable,\n 9764 09:28:15,451 --> 09:28:18,151 and not in this case an\nint, but an actual char. 
9765 09:28:18,152 --> 09:28:23,641 Here, I've made c be a copy of the\n 9766 09:28:24,741 --> 09:28:26,491 So if I were to draw\nit on the screen that 9767 09:28:26,491 --> 09:28:30,631 would give me a different\nrectangle in which this copy of h 9768 09:28:32,042 --> 09:28:33,991 So I didn't intend to\ndo this, but what you're 9769 09:28:33,991 --> 09:28:35,978 seeing is yes, the address of s-- 9770 09:28:35,978 --> 09:28:38,311 and apparently that's at a\npretty low address by default 9771 09:28:38,311 --> 09:28:40,322 here-- then you're\nseeing the address of c. 9772 09:28:40,322 --> 09:28:43,201 But even though each\nof them is h, I claim 9773 09:28:43,201 --> 09:28:45,163 one is at a different address in memory. 9774 09:28:45,163 --> 09:28:46,621 And this has always been happening. 9775 09:28:46,622 --> 09:28:49,352 Any time you created one variable\n 9776 09:28:49,351 --> 09:28:51,269 or here, or here, or\nsomewhere else in memory. 9777 09:28:51,269 --> 09:28:54,271 Now for the first time all we're\n 9778 09:28:54,271 --> 09:28:57,731 the computer's memory to\nsee what is actually there. 9779 09:28:57,732 --> 09:29:01,382 So let me actually back\nthis up a little bit 9780 09:29:01,381 --> 09:29:04,752 and do what I intended to do here,\n 9781 09:29:04,752 --> 09:29:08,912 So if string s equals quote\nunquote, hi, let's go ahead 9782 09:29:08,911 --> 09:29:18,411 and give myself a pointer, called\n 9783 09:29:18,411 --> 09:29:22,251 All right, so now let me go ahead and\n 9784 09:29:24,394 --> 09:29:26,311 So we're just going to\ndo one thing at a time. 9785 09:29:26,311 --> 09:29:29,121 So make address, Enter, ./address. 9786 09:29:29,122 --> 09:29:34,222 There, at the moment, is the\n 9787 09:29:34,222 --> 09:29:36,141 What I meant to do now, was this. 
9788 09:29:36,141 --> 09:29:39,082 If I want to print out\ntwo things this time 9789 09:29:39,082 --> 09:29:44,752 let me print out not only what p is,\n 9790 09:29:44,752 --> 09:29:48,771 Because if I claim that everyone from\n 9791 09:29:48,771 --> 09:29:51,741 s bracket zero just representing\nthe first character in s 9792 09:29:51,741 --> 09:29:54,981 by definition of strings\nbeing arrays of characters. 9793 09:29:54,982 --> 09:30:01,232 Then s, as of today, is itself\nthe address of a character 9794 09:30:02,122 --> 09:30:06,082 So if I now do make\naddress, and do ./address 9795 09:30:06,082 --> 09:30:08,842 this time I see the same exact things. 9796 09:30:13,588 --> 09:30:16,171 This is really the lamest sort\nof thing to be applauding over 9797 09:30:16,171 --> 09:30:21,932 but what we're demonstrating here is\n 9798 09:30:21,932 --> 09:30:23,622 of the first character in s. 9799 09:30:23,622 --> 09:30:26,292 So if we borrow some of our\nmental model from last week-- 9800 09:30:26,292 --> 09:30:31,171 well, if s bracket zero is the first\n 9801 09:30:31,171 --> 09:30:33,711 that expression should be the same as s. 9802 09:30:33,711 --> 09:30:36,211 Now this isn't to say that we\nwould jump through these hoops 9803 09:30:36,211 --> 09:30:40,411 all the time with this much syntax,\n 9804 09:30:40,411 --> 09:30:46,531 that s is in fact, as I claimed a moment\n 9805 09:30:46,531 --> 09:30:50,012 Not even multiple characters, it's\n 9806 09:30:50,012 --> 09:30:53,942 but the key thing is it's the address\n 9807 09:30:53,942 --> 09:30:57,182 and per last week we\ntrust that C is going 9808 09:30:57,182 --> 09:31:00,241 to look for that null\ncharacter at the very end just 9809 09:31:00,241 --> 09:31:04,081 to make sure it knows where\nthe string actually ends. 9810 09:31:04,082 --> 09:31:07,677 All right, a question came up over here. 
9811 09:31:21,942 --> 09:31:25,542 To summarize, on line\neight, when I am using %p-- 9812 09:31:25,542 --> 09:31:28,542 that just means print a pointer\nvalue, so 0x something-- 9813 09:31:30,942 --> 09:31:36,641 Previously, when we used %s, printf knew\n 9814 09:31:36,641 --> 09:31:40,841 of s, but h, i, exclamation point, and\n 9815 09:31:41,982 --> 09:31:47,202 p is different. %p tells the\ncomputer to go to that address-- 9816 09:31:47,201 --> 09:31:52,072 sorry, tells the computer to\nprint that address on the screen. 9817 09:31:52,072 --> 09:31:55,122 So this is where %s all\nthis time has been powerful. 9818 09:31:55,122 --> 09:31:59,322 The reason printf worked\nin week 1 and 2 and 3 9819 09:31:59,322 --> 09:32:02,622 was because printf was designed\nby some human years ago 9820 09:32:02,622 --> 09:32:05,652 to go to the address that's\nbeing passed in-- for instance 9821 09:32:05,652 --> 09:32:07,991 s-- and print out\ncharacter after character 9822 09:32:07,991 --> 09:32:11,652 after character until it sees the\nnull character backslash zero 9823 09:32:13,252 --> 09:32:16,841 So that's-- you're getting a lot\n 9824 09:32:16,841 --> 09:32:19,271 Today we're using\nsomething much simpler, %p 9825 09:32:19,271 --> 09:32:22,572 which just literally prints what s is. 9826 09:32:22,572 --> 09:32:24,311 And the reason we\ndon't do this in week 1 9827 09:32:24,311 --> 09:32:26,381 is just because this\nis like way too much 9828 09:32:26,381 --> 09:32:28,381 to be interesting when\nall you want to print out 9829 09:32:28,381 --> 09:32:29,901 is hi or hello, world, or the like. 9830 09:32:29,902 --> 09:32:31,872 But now what we're\nreally doing is revealing 9831 09:32:31,872 --> 09:32:34,302 what's been going on this whole time. 9832 09:32:34,302 --> 09:32:36,039 And let me make one other example here. 
9833 09:32:36,038 --> 09:32:37,871 Let me go ahead and get\nrid of this variable 9834 09:32:37,872 --> 09:32:41,262 here and let me just print out a\n 9835 09:32:41,262 --> 09:32:45,492 I'm going to print out not just s\n 9836 09:32:46,542 --> 09:32:48,432 the address of every character in s. 9837 09:32:48,432 --> 09:32:52,713 So let's get the first letter\nin s and get its address 9838 09:32:52,713 --> 09:32:54,671 and I'm going to do copy\npaste for time's sake 9839 09:32:54,671 --> 09:32:57,881 but not something I would do frequently. 9840 09:32:57,881 --> 09:33:01,394 So let me print out the address of the\n 9841 09:33:01,394 --> 09:33:03,311 the third, and actually\neven the fourth, which 9842 09:33:03,311 --> 09:33:06,682 is the backslash zero, by doing this. 9843 09:33:06,682 --> 09:33:11,292 So when I compile this program--\nmake address, ./address-- 9844 09:33:11,292 --> 09:33:14,802 I should see two\nidentical values and then 9845 09:33:14,802 --> 09:33:17,292 additional values that\nare one byte away. 9846 09:33:17,292 --> 09:33:22,932 In my diagram a moment ago, my addresses\n 9847 09:33:22,932 --> 09:33:29,201 Now it starts at, by chance,\n0x402004, which is s. 9848 09:33:29,201 --> 09:33:32,741 0x402004 is the same thing\nas s because I'm just 9849 09:33:32,741 --> 09:33:35,351 saying go to the first character\nand then get its address. 9850 09:33:35,351 --> 09:33:36,851 Those are one and the same now. 9851 09:33:36,851 --> 09:33:42,762 And then after that\nis 0x402005, 006, 007 9852 09:33:42,762 --> 09:33:44,542 because that is just like the diagram. 9853 09:33:44,542 --> 09:33:48,342 Go to the i, to the exclamation\n 9854 09:33:48,341 --> 09:33:51,252 So all I'm doing now is using my\nnewfound understanding of what 9855 09:33:51,252 --> 09:33:54,612 ampersand does and what the star\n 9856 09:33:54,612 --> 09:33:57,510 I'm poking around in\nthe computer's memory. 9857 09:33:57,510 --> 09:33:59,052 Just to demonstrate there's no magic. 
9858 09:33:59,052 --> 09:34:02,022 It's all there very deliberately\nbecause I or printf or someone 9859 09:34:11,254 --> 09:34:12,921 DAVID J. MALAN: Really good observation. 9860 09:34:12,921 --> 09:34:16,432 So it's indeed the case\nthat hi, unlike 50 9861 09:34:16,432 --> 09:34:21,652 is ending up at a very low address,\n 9862 09:34:21,652 --> 09:34:24,622 That's actually because,\nlong story short, strings 9863 09:34:24,622 --> 09:34:27,592 are often stored in a different\npart of the computer's memory-- 9864 09:34:27,591 --> 09:34:29,691 more on that later\ntoday-- for efficiency. 9865 09:34:29,692 --> 09:34:32,902 There's actually only going to be one\n 9866 09:34:32,902 --> 09:34:36,182 point, and the computer is going to\n 9867 09:34:36,182 --> 09:34:39,112 but other values like\nints and floats and the 9868 09:34:39,112 --> 09:34:41,752 like-- they end up lower\nin memory by convention. 9869 09:34:41,752 --> 09:34:45,002 But a good observation, because\nthat is consistent here. 9870 09:34:45,002 --> 09:34:48,472 All right, so a couple final details\n 9871 09:34:48,472 --> 09:34:54,052 Let me go ahead and claim that\nwe implemented char star-- 9872 09:34:54,052 --> 09:34:56,752 or rather, string as a\nchar star as follows. 9873 09:34:56,752 --> 09:34:59,091 As of last week we\nwere writing this code. 9874 09:34:59,091 --> 09:35:03,322 As of this week, we can now start\n 9875 09:35:03,322 --> 09:35:06,902 specifically, we invented\nin the CS50 library. 9876 09:35:06,902 --> 09:35:10,252 But it turns out you've seen a way\n 9877 09:35:11,991 --> 09:35:16,222 We played around last time with data\n 9878 09:35:16,222 --> 09:35:20,002 and briefly the typedef keyword,\nwhich defines a type for you. 
9879 09:35:20,002 --> 09:35:22,012 And if I highlight\nwhat's interesting here 9880 09:35:22,012 --> 09:35:25,702 the way we invented a\nperson data type last time 9881 09:35:25,701 --> 09:35:28,761 was to define a person as having\ntwo variables inside of it-- 9882 09:35:28,762 --> 09:35:33,959 a structure that encapsulates a\nname and encapsulates a number. 9883 09:35:33,959 --> 09:35:37,042 Now even though the syntax is a little\n 9884 09:35:37,042 --> 09:35:43,131 thing, notice that this could be a\n 9885 09:35:43,131 --> 09:35:47,421 If I want to create a type called\n 9886 09:35:47,421 --> 09:35:51,591 then I use typedef to make\nit defined to be char star. 9887 09:35:51,591 --> 09:35:55,311 So this is literally all\nthat has ever been in CS50.h 9888 09:35:55,311 --> 09:35:58,131 in addition to those prototypes\nof functions we've talked about. 9889 09:35:58,131 --> 09:36:01,191 typedef char star string\nis one line of code 9890 09:36:01,192 --> 09:36:05,919 that brings the word string\nas a data type into existence 9891 09:36:05,919 --> 09:36:07,502 and that's all that's ever been there. 9892 09:36:07,502 --> 09:36:10,641 But the star, the char star,\nis just too much in week 1. 9893 09:36:10,641 --> 09:36:14,031 We wait until this point\nto peel back that layer. 9894 09:36:14,031 --> 09:36:16,521 Are any questions, then,\non what a string is? 9895 09:36:16,521 --> 09:36:19,101 What star or the ampersand are doing? 9896 09:36:26,432 --> 09:36:30,031 If that is-- is that why when you\n 9897 09:36:30,031 --> 09:36:34,031 did, or almost did, problems arise. 9898 09:36:34,031 --> 09:36:36,332 And in fact yes, last\nweek we used str compare-- 9899 09:36:36,332 --> 09:36:40,711 STRCMP-- for a very deliberate\n 9900 09:36:40,711 --> 09:36:45,301 accidentally would have compared two\n 9901 09:36:50,574 --> 09:36:53,531 All right, well, before we give\n 9902 09:36:53,531 --> 09:36:54,762 we have lots of pieces of paper. 
9903 09:36:54,762 --> 09:36:57,552 If anyone wants to come on up and\n 9904 09:36:57,552 --> 09:36:59,562 if you want to make your own\neight by eight grid of something 9905 09:36:59,561 --> 09:37:02,621 to share with the class if you're\n 9906 09:37:02,622 --> 09:37:05,352 Otherwise, let's take 10 minutes\nand we'll return after 10. 9907 09:37:05,351 --> 09:37:10,271 All right, so let's come\nback to this question of how 9908 09:37:10,271 --> 09:37:13,241 we can start to use these pointers\n 9909 09:37:14,332 --> 09:37:16,572 The goal ultimately\nnext week is going to be 9910 09:37:16,572 --> 09:37:20,292 to use these addresses to really\n 9911 09:37:20,292 --> 09:37:23,622 structures than just persons,\nlike last week, or candidates 9912 09:37:23,622 --> 09:37:25,422 in the context of an\nelectoral algorithm 9913 09:37:25,421 --> 09:37:28,991 if you will, and actually really use\n 9914 09:37:28,991 --> 09:37:32,051 to represent not just\nimages but maybe videos 9915 09:37:32,052 --> 09:37:34,552 and other two-dimensional\nstructures as well. 9916 09:37:34,552 --> 09:37:36,942 But for now, let's come back\nto this address example 9917 09:37:36,942 --> 09:37:41,922 whittle it down to just a hi initially,\n 9918 09:37:42,822 --> 09:37:45,762 So let me re-add the\nCS50 library just so we 9919 09:37:45,762 --> 09:37:49,391 use our synonym for a moment,\nthat is the word string 9920 09:37:49,391 --> 09:37:51,521 and I'll redefine s as a string. 
9921 09:37:51,521 --> 09:37:54,191 And what I didn't mention before\nis that these double quotes 9922 09:37:54,192 --> 09:37:57,042 that you've been using for some\n 9923 09:37:57,042 --> 09:38:00,281 The double quotes are\na clue to the compiler 9924 09:38:00,281 --> 09:38:04,671 that what is between them is in\nfact a string as we now know it 9925 09:38:04,671 --> 09:38:07,932 which means the compiler will\ndo all the work of figuring out 9926 09:38:07,932 --> 09:38:10,692 where to put the h, the\ni, the exclamation point 9927 09:38:10,692 --> 09:38:13,722 and even adding for you\nautomatically a backslash zero. 9928 09:38:13,722 --> 09:38:15,942 And what the compiler\nwill do for you, too 9929 09:38:15,942 --> 09:38:18,822 is figure out what address\nall four of those chars 9930 09:38:18,822 --> 09:38:22,692 ended up at and store it\nfor you in the variable s. 9931 09:38:22,692 --> 09:38:26,891 So that's why it just happens with\n 9932 09:38:26,891 --> 09:38:31,271 or even stars explicitly, but the star\n 9933 09:38:31,271 --> 09:38:33,761 string is just synonymous\nnow with char star. 9934 09:38:33,762 --> 09:38:37,732 It's not really as readable,\nbut it is now the same idea. 9935 09:38:37,732 --> 09:38:40,272 So I'll leave string in place\njust to do something week 9936 09:38:40,271 --> 09:38:43,941 1 style here for a moment, and let's go\n 9937 09:38:43,942 --> 09:38:49,391 So I'm going to use %c this time, and\n 9938 09:38:49,391 --> 09:38:54,521 and then I'm going to print out\ns bracket one and s bracket two 9939 09:38:54,521 --> 09:38:58,451 literally doing week three\nstyle from last week-- 9940 09:38:58,451 --> 09:39:03,281 a printing of every character\nin s as though it were an array. 9941 09:39:03,281 --> 09:39:06,582 So ./address should give\nme h-i exclamation point. 
9942 09:39:06,582 --> 09:39:09,822 And if I really want to get\ncurious, technically speaking 9943 09:39:09,822 --> 09:39:14,052 I could print out one more location,\n 9944 09:39:14,052 --> 09:39:19,572 make address ./address and there is,\n 9945 09:39:19,572 --> 09:39:25,002 I'm not seeing zero because I didn't\n 9946 09:39:25,002 --> 09:39:28,692 it's literally eight zero bits\n 9947 09:39:28,692 --> 09:39:30,322 if you will, in printf speak. 9948 09:39:30,322 --> 09:39:32,711 And so what I'm seeing here\nis like a blank symbol. 9949 09:39:32,711 --> 09:39:34,902 That just means there is\nsomething else there-- 9950 09:39:34,902 --> 09:39:39,162 it's apparently all eight\nzero bits, but they are there 9951 09:39:39,161 --> 09:39:41,931 even though we're not seeing\nthem literally right now. 9952 09:39:41,932 --> 09:39:44,572 Well, let's go ahead and\npeel back one of these layers 9953 09:39:44,572 --> 09:39:48,491 and let me go ahead and get rid of\n 9954 09:39:48,491 --> 09:39:51,911 therefore, the word string because\n 9955 09:39:53,262 --> 09:39:56,141 I'm going to now do\nmake address, ./address 9956 09:39:56,141 --> 09:39:57,612 and it's the same exact thing. 9957 09:39:57,612 --> 09:40:00,982 And now, let's just focus on the hi\n 9958 09:40:00,982 --> 09:40:05,772 So I'm going to recompile one last time\n 9959 09:40:05,771 --> 09:40:10,361 Well, it turns out that the\narray notation we used last week 9960 09:40:10,362 --> 09:40:12,972 was technically some of\nthis syntactic sugar. 9961 09:40:12,972 --> 09:40:16,182 Sort of a neat way to use\nsyntax in a useful way 9962 09:40:16,182 --> 09:40:21,792 but we can see more explicitly today\n 9963 09:40:23,421 --> 09:40:25,161 Let me go ahead and do this. 9964 09:40:25,161 --> 09:40:30,401 Let me adventurously say I\nwant to print out not s bracket 9965 09:40:30,402 --> 09:40:36,192 zero, but I want to print out\n 9966 09:40:36,192 --> 09:40:38,442 So to be clear, what is s now? 
9967 09:40:38,442 --> 09:40:39,792 It's the address of a string. 9968 09:40:41,292 --> 09:40:44,802 s is the address of the\nfirst char in a string 9969 09:40:44,802 --> 09:40:47,802 and again, that's sufficient for\n 9970 09:40:47,802 --> 09:40:50,722 the computer will see that there's\n 9971 09:40:50,722 --> 09:40:56,601 So s is specifically the address\n 9972 09:40:56,601 --> 09:40:59,652 So that means, using my\nnew syntax, if I want 9973 09:40:59,652 --> 09:41:02,944 to print out that first\ncharacter I can print out star 9974 09:41:02,944 --> 09:41:06,834 s, because recall that star is the\n 9975 09:41:06,834 --> 09:41:09,042 repeat the word char, you\ndon't repeat the word int-- 9976 09:41:10,661 --> 09:41:13,181 That means go to that address. 9977 09:41:13,182 --> 09:41:18,012 Similarly, if I, in my newfound\nknowledge of how strings work 9978 09:41:18,012 --> 09:41:21,641 know that the h comes first,\nthen the i right after it 9979 09:41:21,641 --> 09:41:25,512 then the exclamation point, then\n 9980 09:41:25,512 --> 09:41:29,292 one byte apart, I could\nstart to do some arithmetic. 9981 09:41:29,292 --> 09:41:34,932 I could go to s plus 1 byte and\nprint out the second character 9982 09:41:34,932 --> 09:41:38,682 and I could print out\nwhatever is at s plus 2-- 9983 09:41:38,682 --> 09:41:41,951 in fact, doing what's generally\nknown as pointer arithmetic. 9984 09:41:41,951 --> 09:41:44,951 Literally treating pointers\nas the numbers they are-- 9985 09:41:44,951 --> 09:41:48,191 hexadecimal or decimal, doesn't really\n 9986 09:41:48,192 --> 09:41:51,022 And go ahead and add\none byte or two bytes 9987 09:41:51,021 --> 09:41:53,511 to them to start at the\nbeginning of a string 9988 09:41:53,512 --> 09:41:56,192 and just poke around from left to right. 
9989 09:41:56,192 --> 09:42:00,262 So this now is equivalent to what we\n 9990 09:42:00,262 --> 09:42:05,031 notation, but now I'm re-implementing\n 9991 09:42:05,031 --> 09:42:09,182 plumbing, understanding ampersand\n 9992 09:42:09,182 --> 09:42:11,961 so if I remake this\nprogram and do ./address 9993 09:42:11,961 --> 09:42:14,489 I should still see\nh-i exclamation point. 9994 09:42:14,489 --> 09:42:16,822 But what I'm really doing is\njust kind of demonstrating 9995 09:42:16,822 --> 09:42:20,211 hopefully, my understanding\nof what really 9996 09:42:20,211 --> 09:42:22,072 is going on in the computer's memory. 9997 09:42:22,072 --> 09:42:24,591 Now, programmers who are\nmaybe trying to show off 9998 09:42:24,591 --> 09:42:25,971 might actually write this syntax. 9999 09:42:25,972 --> 09:42:28,597 I think the more common syntax\nwould be what we did last week-- 10000 09:42:28,597 --> 09:42:30,332 s bracket zero, s bracket one. 10001 09:42:30,832 --> 09:42:32,707 It's just a little more\nreadable and we don't 10002 09:42:32,707 --> 09:42:36,891 need to brag about or care about\nthis underlying representation. 10003 09:42:36,891 --> 09:42:39,771 The square brackets last week\nwere an abstraction, if you will 10004 09:42:39,771 --> 09:42:42,081 on top of what is lower level math. 10005 09:42:42,082 --> 09:42:44,722 But that's all that's going\non underneath the hood. 10006 09:42:44,722 --> 09:42:48,171 We're poking around from\nbyte to byte to byte. 10007 09:42:48,171 --> 09:42:53,582 All right, let me pause here, see if\n 10008 09:42:56,292 --> 09:42:59,012 Let's do one more then, just\nto demonstrate that this is not 10009 09:43:00,531 --> 09:43:02,521 Let me go ahead and\nget rid of all of this 10010 09:43:02,521 --> 09:43:06,901 and let me give myself an array\nof numbers like I did last week. 
10011 09:43:06,902 --> 09:43:09,182 So if I'm going to\ndeclare all the numbers 10012 09:43:09,182 --> 09:43:11,881 at once using this funky\ncurly brace notation 10013 09:43:11,881 --> 09:43:15,331 I can do like 4, 6, 8, 2, 7, 5, 0. 10014 09:43:15,332 --> 09:43:19,412 So seven different numbers inside\n 10015 09:43:20,432 --> 09:43:22,491 I don't, strictly speaking,\nneed to say seven. 10016 09:43:22,491 --> 09:43:24,241 The compiler is smart\nenough to figure out 10017 09:43:24,241 --> 09:43:26,612 how many numbers I put\nwith commas between them 10018 09:43:26,612 --> 09:43:31,112 and that just gives me an array\ncontaining 4, 6, 8, 2, 7, 5, 0. 10019 09:43:31,112 --> 09:43:34,561 So it turns out I can print each of\n 10020 09:43:34,561 --> 09:43:40,381 I can do a printf of %i backslash n,\n 10021 09:43:40,381 --> 09:43:44,401 and let me just do some quick copy/paste\n 10022 09:43:44,402 --> 09:43:49,241 Theoretically, that should\nprint out 4, 6, 8, and so forth. 10023 09:43:49,241 --> 09:43:52,381 But I can do the same sort\nof manipulation understanding 10024 09:43:52,381 --> 09:43:55,292 what pointers now are,\nusing pointer arithmetic. 10025 09:43:55,292 --> 09:43:59,101 So let me actually unwind this\nand just go back to one printf 10026 09:43:59,101 --> 09:44:02,551 and instead of printing numbers bracket\n 10027 09:44:02,552 --> 09:44:06,722 let me just go and print out\nwhatever is at that address-- 10028 09:44:08,792 --> 09:44:11,222 Let me then print out\nthe second digit, which 10029 09:44:11,222 --> 09:44:16,412 is going to be whatever is at numbers\n 10030 09:44:16,411 --> 09:44:20,381 and do whatever is at numbers plus 2,\n 10031 09:44:20,381 --> 09:44:22,621 let me do it four more\ntimes and do what's 10032 09:44:22,622 --> 09:44:27,242 at location three, four, five, and six. 10033 09:44:27,241 --> 09:44:30,991 And that's seven total numbers\n 10034 09:44:30,991 --> 09:44:32,561 So let me just quickly run this. 
10035 09:44:35,012 --> 09:44:37,742 There are those seven\ndigits being printed. 10036 09:44:37,741 --> 09:44:41,761 But there's something\nsubtle but also useful here. 10037 09:44:45,752 --> 09:44:47,891 Because I made an array of integers. 10038 09:44:47,891 --> 09:44:52,542 But think back-- how big is a\ntypical integer, have we claimed? 10039 09:44:52,542 --> 09:44:58,182 Four bytes, or 32 bits, so it's\nworth noting that I don't really 10040 09:44:58,182 --> 09:45:00,201 need to worry about that detail. 10041 09:45:00,201 --> 09:45:05,479 Notice that I did not do plus 4,\n 10042 09:45:05,480 --> 09:45:07,272 I, the programmer,\nstrictly speaking, don't 10043 09:45:07,271 --> 09:45:09,551 need to worry about how\nbig the data type is. 10044 09:45:09,552 --> 09:45:11,652 This is the power of pointer arithmetic. 10045 09:45:11,652 --> 09:45:17,292 The compiler is smart enough to know\n 10046 09:45:17,292 --> 09:45:21,802 that is the same as saying\ngo one more piece of data-- 10047 09:45:22,841 --> 09:45:24,611 so if it's an int, move four. 10048 09:45:24,612 --> 09:45:26,232 If it's a second int, move eight. 10049 09:45:26,232 --> 09:45:27,962 If it's a third int, move 12. 10050 09:45:27,961 --> 09:45:31,182 Pointer arithmetic handles that\nannoying arithmetic for you 10051 09:45:31,182 --> 09:45:33,822 so you can just think of this\nas a number after a number 10052 09:45:33,822 --> 09:45:37,182 after a number that are back to\n 10053 09:45:38,531 --> 09:45:42,561 Which is only to say plus 1, plus 2,\n 10054 09:45:43,061 --> 09:45:48,481 Because the compiler knows what\n 10055 09:45:48,482 --> 09:45:51,872 Now, there's one other\ndetail I should reveal here 10056 09:45:51,872 --> 09:45:54,032 that I've taken for granted. 
10057 09:45:54,031 --> 09:45:57,002 In the past I was using double\nquotes to represent strings 10058 09:45:57,002 --> 09:45:59,732 and I claim that the compiler's\nsmart enough to realize that oh 10059 09:45:59,732 --> 09:46:04,272 if I have double quote hi, that means\n 10060 09:46:04,271 --> 09:46:05,791 and then the backslash zero. 10061 09:46:08,161 --> 09:46:13,921 It turns out that you can actually treat\n 10062 09:46:13,921 --> 09:46:16,141 is itself a pointer,\nand this is actually 10063 09:46:16,141 --> 09:46:18,512 going to be something\nuseful in upcoming problems 10064 09:46:18,512 --> 09:46:22,082 when we want to pass arrays\naround in the computer's memory. 10065 09:46:22,082 --> 09:46:25,824 Notice that strictly speaking on line\n 10066 09:46:25,824 --> 09:46:27,781 There's no star, there's\nno ampersand-- there's 10067 09:46:27,781 --> 09:46:31,021 nothing new there, and yet\ninstantly on line seven 10068 09:46:31,021 --> 09:46:35,851 I'm pretending that it is the\naddress, and this is actually OK. 10069 09:46:35,851 --> 09:46:39,752 It turns out that an array\nreally can be treated 10070 09:46:39,752 --> 09:46:43,241 as the address of the first\nelement in that array. 10071 09:46:43,241 --> 09:46:47,439 The difference is that there's no\n 10072 09:46:47,440 --> 09:46:49,232 This is just part of\nthe phone number here 10073 09:46:49,232 --> 09:46:52,052 the ending in zero-- that's not\nlike a special backslash zero. 10074 09:46:52,052 --> 09:46:55,082 So this is something we're going to\n 10075 09:46:55,082 --> 09:46:58,802 There's this interrelationship\nbetween addresses and arrays 10076 09:46:58,802 --> 09:47:03,482 that just generally allows you to\n 10077 09:47:03,482 --> 09:47:05,882 but the math is taken care of for you. 10078 09:47:05,881 --> 09:47:10,322 Are any questions then on this before\n 10079 09:47:19,144 --> 09:47:20,311 DAVID J. MALAN: Potentially. 
10080 09:47:20,311 --> 09:47:24,271 If you go beyond the end of an array,\n 10081 09:47:24,271 --> 09:47:27,541 The problem is that that symptom\nis sometimes nondeterministic 10082 09:47:27,542 --> 09:47:30,542 which means that sometimes it\nwill happen, sometimes it won't. 10083 09:47:30,542 --> 09:47:34,502 It often depends on how far off the\n 10084 09:47:34,502 --> 09:47:36,991 You'll often not induce\nthe segmentation fault 10085 09:47:36,991 --> 09:47:39,781 if you just poke a little too\nfar, but if you go way too far 10086 09:47:41,192 --> 09:47:44,522 But we'll give you a tool today\n 10087 09:47:44,521 --> 09:47:46,541 exactly that kind of situation. 10088 09:47:46,542 --> 09:47:49,452 So let's go ahead now and do\n 10089 09:47:49,451 --> 09:47:51,961 but that actually comes back\nto that spoiler from earlier. 10090 09:47:51,961 --> 09:47:56,832 Let me go ahead and create a program\n 10091 09:47:56,832 --> 09:48:00,002 I'm going to go ahead and\nallow myself the CS50 library 10092 09:48:00,002 --> 09:48:03,482 not so much for string but so that\n 10093 09:48:03,482 --> 09:48:07,801 which is way easier than the way we'll\n 10094 09:48:07,800 --> 09:48:10,832 Let me give myself stdio.h,\ndo an int main(void) 10095 09:48:10,832 --> 09:48:13,742 not worrying about command line\n 10096 09:48:13,741 --> 09:48:18,061 and get an int i using get int, and\n 10097 09:48:18,061 --> 09:48:23,822 then let me give myself an int j, ask\n 10098 09:48:23,822 --> 09:48:27,991 and then let me go ahead and kind of\n 10099 09:48:27,991 --> 09:48:31,411 if i equals equals j,\nthen let's go ahead 10100 09:48:31,411 --> 09:48:36,481 and print out something like "same,"\n 10101 09:48:36,482 --> 09:48:40,152 and print out "different" if\nthey are not, in fact, the same. 
10102 09:48:40,152 --> 09:48:44,312 So that would seem to be a program that\n 10103 09:48:44,311 --> 09:48:46,621 All right, so let's go\nahead and run make compare-- 10104 09:48:48,811 --> 09:48:52,351 OK, i will be 50, j will be 50-- 10105 09:48:54,582 --> 09:48:57,599 i will be 50, j will be 42. 10106 09:48:58,391 --> 09:49:02,701 So so far, so good in this\nfirst version of comparison. 10107 09:49:02,701 --> 09:49:05,771 But as you might see\nwhere I'm going with this 10108 09:49:05,771 --> 09:49:09,511 let's move away from integers and let's\n 10109 09:49:10,661 --> 09:49:13,261 So I could do string s over here-- 10110 09:49:15,841 --> 09:49:22,711 Then I could do string t over\nhere, and GetString over here 10111 09:49:22,711 --> 09:49:25,442 asking the user for t this time, here. 10112 09:49:25,442 --> 09:49:26,972 And then I can compare the two. 10113 09:49:28,819 --> 09:49:30,152 and this is a common convention. 10114 09:49:30,152 --> 09:49:33,182 If you've used s for string already you\n 10115 09:49:33,182 --> 09:49:34,802 for simple demonstrations like this. 10116 09:49:34,802 --> 09:49:37,927 I'm going to compare the two, just like\n 10117 09:49:37,927 --> 09:49:41,882 Make compare-- so far\nso good-- ./compare-- 10118 09:49:44,582 --> 09:49:47,792 Let me go ahead and\ntype in something like 10119 09:49:47,792 --> 09:49:52,762 hi, exclamation point and bye,\n 10120 09:49:52,762 --> 09:49:54,662 should definitely be different. 10121 09:49:54,661 --> 09:50:00,481 Let me run it again with hi, exclamation\n 10122 09:50:00,482 --> 09:50:02,432 Different-- maybe I messed up. 10123 09:50:02,432 --> 09:50:05,542 Let's maybe do it lowercase,\nmaybe that'll fix. 10124 09:50:05,542 --> 09:50:07,862 But no, those two are different. 10125 09:50:07,862 --> 09:50:11,842 So to come back to what I described\nas a spoiler earlier, what's 10126 09:50:11,841 --> 09:50:16,019 the fundamental issue here, to be clear? 
10127 09:50:16,019 --> 09:50:18,061 Why is it saying different\neven though I'm pretty 10128 09:50:18,061 --> 09:50:19,478 sure I typed the same thing twice. 10129 09:50:21,542 --> 09:50:24,961 Yeah, this is where it's now\nuseful to know that string has been 10130 09:50:24,961 --> 09:50:28,423 an abstraction-- a training wheel, if\n 10131 09:50:28,423 --> 09:50:30,631 still use GetString because\nthat's convenient still-- 10132 09:50:30,631 --> 09:50:33,421 but if I change string\nto be char star, it's 10133 09:50:33,421 --> 09:50:39,661 a little more explicit as to what s and\n 10134 09:50:39,661 --> 09:50:42,121 that is the address of\na char. t is a pointer 10135 09:50:42,122 --> 09:50:44,282 to a char, that is\nthe address of a char. 10136 09:50:44,281 --> 09:50:47,432 Specifically, the first character\nin s and the first character 10137 09:50:49,211 --> 09:50:51,436 So if I'm comparing\nthese two it should stand 10138 09:50:51,436 --> 09:50:53,311 to reason that they're\ngoing to be different. 10139 09:50:53,811 --> 09:50:57,421 Because s might end up here in memory\n 10140 09:50:57,421 --> 09:51:00,542 Each time I call GetString, it is\n 10141 09:51:00,542 --> 09:51:02,531 to know that, wait a minute--\nyou typed the same thing. 10142 09:51:02,531 --> 09:51:04,051 I'm just going to hand\nyou back the same address. 10143 09:51:04,052 --> 09:51:06,872 That doesn't happen because we\n 10144 09:51:06,872 --> 09:51:10,502 Each time I call GetString,\nit returns, apparently 10145 09:51:10,502 --> 09:51:13,262 a different copy of the\nstring that was typed in. 10146 09:51:13,262 --> 09:51:15,572 A hi over here and a hi over here. 10147 09:51:15,572 --> 09:51:18,152 They might look the same to\nthe human but to the computer 10148 09:51:18,152 --> 09:51:22,052 they are different chunks of memory,\n 10149 09:51:22,052 --> 09:51:25,542 And here, too, we can reveal\nwhat is GetString returning? 
10150 09:51:25,542 --> 09:51:29,522 Well, up until today it was\nreturning a string, so to speak. 10151 09:51:31,021 --> 09:51:33,361 Technically, what\nGetString has always been 10152 09:51:33,362 --> 09:51:38,732 doing is returning the address\nof the first char in a string 10153 09:51:38,732 --> 09:51:42,542 and trusting that we put a backslash\n 10154 09:51:42,542 --> 09:51:46,772 typed in, and that's enough now\nfor printf, for strlen, for you 10155 09:51:46,771 --> 09:51:49,322 to know where a string begins and ends. 10156 09:51:49,322 --> 09:51:53,072 So GetString has actually\nalways returned a pointer. 10157 09:51:53,072 --> 09:51:56,461 It has not returned a quote\nunquote string per se 10158 09:51:56,461 --> 09:51:59,762 but there are functions that can\nsolve this comparison for us. 10159 09:51:59,762 --> 09:52:02,862 Recall that I could do\nsomething like this. 10160 09:52:02,862 --> 09:52:05,792 I could actually go\nin here and I could-- 10161 09:52:07,002 --> 09:52:14,341 So if I include str compare here and\n 10162 09:52:14,341 --> 09:52:18,061 let's see now what happens\nwhen I make compare. 10163 09:52:18,061 --> 09:52:21,572 Implicitly declaring library\n 10164 09:52:22,682 --> 09:52:26,162 So you might have seen this error before\n 10165 09:52:26,161 --> 09:52:30,641 but there's some evidence of\nstars or pointers going on here. 10166 09:52:30,641 --> 09:52:33,131 It looks like I didn't include\nthe string.h header file 10167 09:52:34,322 --> 09:52:38,911 Include string.h which, despite its\n 10168 09:52:38,911 --> 09:52:41,791 called string, it just has\nstring-related functions in it 10169 09:52:46,591 --> 09:52:50,371 Now let's type in hi, exclamation\n 10170 09:52:50,372 --> 09:52:54,002 These are now-- oh, I used it wrong. 10171 09:52:55,724 --> 09:52:58,141 That was supposed to be\nimpressive, but it's the opposite. 10172 09:53:07,618 --> 09:53:09,951 DAVID J. MALAN: Yeah, it\nreturns three different values. 
10173 09:53:09,951 --> 09:53:13,731 Zero if they're the same, positive\nif one comes before the other 10174 09:53:13,732 --> 09:53:15,422 negative if the opposite is true. 10175 09:53:15,421 --> 09:53:18,621 I just forgot that, so like\nI did last week correctly 10176 09:53:18,622 --> 09:53:22,102 if I want to compare them for\nequality per the manual page 10177 09:53:22,101 --> 09:53:24,781 I should be checking for\nzero as the return value. 10178 09:53:24,781 --> 09:53:27,951 Now make compare, ./compare, Enter. 10179 09:53:27,951 --> 09:53:30,621 Let's try it one last time-- hi and hi. 10180 09:53:30,622 --> 09:53:32,182 OK now, they're in fact the same. 10181 09:53:37,232 --> 09:53:40,112 And indeed, not that it's\nreturning same all the time. 10182 09:53:40,112 --> 09:53:42,332 If I type in hi and\nthen bye, it's indeed 10183 09:53:42,332 --> 09:53:44,622 noticing that difference as well. 10184 09:53:44,622 --> 09:53:48,612 Well, let me go ahead and\ndo one other thing here. 10185 09:53:50,862 --> 09:53:54,362 Let me go ahead now and just reveal\n 10186 09:53:54,362 --> 09:53:57,692 Let's get rid of the string comparison\n 10187 09:53:57,692 --> 09:54:01,472 The simple way to print this out would\n 10188 09:54:02,521 --> 09:54:05,701 taking an address and start\nthere, print every character up 10189 09:54:05,701 --> 09:54:09,101 until the backslash zero, so let's\njust hand it s and do that. 10190 09:54:09,101 --> 09:54:12,271 And then let's do one more, %s,t. 10191 09:54:12,271 --> 09:54:17,111 This is, again, sort of a\nmix of week 1 and this week 10192 09:54:17,112 --> 09:54:18,932 because I got rid of the word string. 10193 09:54:18,932 --> 09:54:24,072 I'm using char star, but I'm still\n 10194 09:54:24,072 --> 09:54:27,692 Let me go ahead and run compare\nnow, and if I type hi and hi 10195 09:54:27,692 --> 09:54:29,652 I should see the same thing twice.
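To make the two kinds of comparison concrete, here is a minimal sketch in C. It is an illustration, not the compare.c typed in the lecture: the helper names same_address and same_text are hypothetical, and any two char arrays can stand in for what GetString would return.

```c
#include <stdbool.h>
#include <string.h>

// Compares the pointers themselves, which is what == does:
// true only if both point at the very same chunk of memory.
bool same_address(const char *s, const char *t)
{
    return s == t;
}

// Compares the characters the pointers lead to.
// strcmp returns 0 when the two strings hold the same text.
bool same_text(const char *s, const char *t)
{
    return strcmp(s, t) == 0;
}
```

Given char s[] = "hi!"; and char t[] = "hi!";, same_address(s, t) is false because the two arrays live in different places, while same_text(s, t) is true.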
10196 09:54:29,652 --> 09:54:33,272 So they look the same, but here\nnow we have the syntax today 10197 09:54:33,271 --> 09:54:35,651 to print out the actual\naddresses of these things. 10198 09:54:35,652 --> 09:54:40,082 So let me just change the s to a p,\n 10199 09:54:40,082 --> 09:54:44,012 and print it, it means just\nprint the address as a pointer. 10200 09:54:44,012 --> 09:54:48,781 So make compare, ./compare, and now\n 10201 09:54:48,781 --> 09:54:53,192 and I should see, indeed, two\nslightly different addresses given 10202 09:54:54,002 --> 09:54:56,311 One's got a B at the end,\none's got an F at the end 10203 09:54:56,311 --> 09:54:58,841 and they are indeed a few bytes apart. 10204 09:54:58,841 --> 09:55:02,066 So this is just confirming what\n 10205 09:55:02,067 --> 09:55:04,442 So what does this mean, perhaps\nin the computer's memory? 10206 09:55:05,942 --> 09:55:09,872 I've zoomed out so I have a little\n 10207 09:55:09,872 --> 09:55:16,262 Here might be s in memory when I do\n 10208 09:55:16,262 --> 09:55:19,742 I get a variable that's of size\n 10209 09:55:19,741 --> 09:55:23,311 claimed earlier that on modern systems,\n 10210 09:55:23,311 --> 09:55:25,621 nowadays so they can count even higher. 10211 09:55:25,622 --> 09:55:28,607 And inside of the computer's\nmemory, also, might be hi. 10212 09:55:28,607 --> 09:55:31,232 And I don't know where it ends\nup so for the sake of discussion 10213 09:55:32,161 --> 09:55:35,121 That's what was free\nwhen I ran the program. 10214 09:55:35,122 --> 09:55:36,961 h-i exclamation point, backslash zero. 10215 09:55:36,961 --> 09:55:42,122 Maybe it ended up, for the sake of\n 10216 09:55:42,122 --> 09:55:47,162 So to be clear, what is s\nstoring once the assignment 10217 09:55:47,161 --> 09:55:50,072 operator copies from right to left? 10218 09:55:50,072 --> 09:55:54,692 What is s storing if I\nadvance one more slide? 
10219 09:55:56,811 --> 09:56:00,621 0x123, the presumption\nbeing that if a string is 10220 09:56:00,622 --> 09:56:04,597 defined by the address of its first\n 10221 09:56:04,597 --> 09:56:09,052 is 0x123, then that's indeed\nwhat should be in the variable s. 10222 09:56:09,052 --> 09:56:12,112 And so technically, that's what's\n 10223 09:56:13,612 --> 09:56:16,762 GetString indeed returns\na string, so to speak 10224 09:56:16,762 --> 09:56:20,601 but more properly it returns\nthe address of a char. 10225 09:56:20,601 --> 09:56:24,082 What's been then copied from right to\n 10226 09:56:24,082 --> 09:56:26,961 all these weeks is indeed that address. 10227 09:56:26,961 --> 09:56:31,461 Now technically, we don't really need\n 10228 09:56:31,461 --> 09:56:34,311 It suffices to just think about\nthem referentially, but let's 10229 09:56:34,311 --> 09:56:38,151 first consider where t might be.\n 10230 09:56:38,152 --> 09:56:39,802 created on my second line of code. 10231 09:56:39,802 --> 09:56:41,422 Maybe it ends up there,\nmaybe somewhere else. 10232 09:56:41,421 --> 09:56:43,713 For the sake of discussion\nI'll draw it left and right. 10233 09:56:43,713 --> 09:56:47,131 Where did the second word\nend up that I typed in? 10234 09:56:47,131 --> 09:56:53,031 Well, suppose the second copy of\nhi ended up at 0x456, 0x457, 0x458, 0x459. 10235 09:56:54,322 --> 09:56:55,911 I'll pluck this one off myself. 10236 09:56:57,982 --> 09:57:01,432 And so this is now a pictorial\nrepresentation of why 10237 09:57:01,432 --> 09:57:03,112 and let's abstract away everything else. 10238 09:57:03,112 --> 09:57:08,421 When I compared s against t using\n 10239 09:57:08,421 --> 09:57:09,951 they're obviously not the same. 10240 09:57:09,951 --> 09:57:12,112 One is over here, one is over here. 10241 09:57:12,112 --> 09:57:16,641 And per a moment ago, one is\n0x123, the other is 0x456.
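The idea that s and t each hold nothing more than an address can be checked with the %p format code. The sketch below is an assumption-laden illustration: format_address is a hypothetical helper, and the exact text %p produces (0x123-style or otherwise) depends on the system.

```c
#include <stdio.h>

// Writes the address held in p into buf, formatted the way printf's %p
// would print it. %p expects a void *, hence the cast.
// Returns the number of characters snprintf wanted to write.
int format_address(char *buf, size_t size, char *p)
{
    return snprintf(buf, size, "%p", (void *) p);
}
```

Calling it once for s and once for t yields two different address strings even when both hold the text hi!, which is the point of the picture being described.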
10242 09:57:16,641 --> 09:57:19,851 Yes, technically they're pointing\nat something that's the same 10243 09:57:19,851 --> 09:57:23,332 but that just reveals\nhow str compare works. 10244 09:57:23,332 --> 09:57:26,002 str compare is apparently\na function that 10245 09:57:26,002 --> 09:57:29,241 takes in the address of\na string as its argument 10246 09:57:29,241 --> 09:57:31,761 and the address of another\nstring as its argument 10247 09:57:31,762 --> 09:57:36,682 it goes to the first character in\n 10248 09:57:36,682 --> 09:57:38,872 and probably has a for\nloop or a while loop 10249 09:57:38,872 --> 09:57:41,782 and just goes from left to\nright, comparing, looking 10250 09:57:41,781 --> 09:57:45,502 for the same chars left and right, and\n 10251 09:57:47,482 --> 09:57:51,842 If it does notice a difference it\n 10252 09:57:51,841 --> 09:57:55,681 And that's very similar, recall, to how\n 10253 09:57:56,182 --> 09:57:59,092 I used a for loop, I was\nlooking for a backslash zero. 10254 09:57:59,091 --> 09:58:04,881 str compare is probably a little similar\n 10255 09:58:04,881 --> 09:58:08,362 but comparing, this\ntime not just counting. 10256 09:58:08,362 --> 09:58:11,092 Are any questions then,\non string comparison 10257 09:58:11,091 --> 09:58:14,181 and why it is that we use str\ncompare and not equals equals? 10258 09:58:15,374 --> 09:58:17,610 AUDIENCE: Do pointers have addresses? 10259 09:58:17,610 --> 09:58:19,402 DAVID J. MALAN: Do\npointers have addresses? 10260 09:58:19,902 --> 09:58:24,652 So we won't do that today, but I could\n 10261 09:58:26,182 --> 09:58:29,781 That would give me the\nequivalent of a char star star 10262 09:58:29,781 --> 09:58:31,966 that itself could be\nstored elsewhere in memory. 10263 09:58:32,841 --> 09:58:35,031 We don't do that recursively forever. 
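The left-to-right comparison described here can be sketched as follows. This is a hypothetical compare_strings written to mirror the description; the real strcmp in the C library may be implemented differently, and only the sign of its nonzero results is guaranteed.

```c
// Walks both strings from left to right, one char at a time.
// Returns 0 if they match all the way to the terminating '\0',
// a negative number if s sorts before t, and a positive number otherwise.
int compare_strings(const char *s, const char *t)
{
    int i = 0;
    // Keep going while the chars agree and s has not ended.
    while (s[i] != '\0' && s[i] == t[i])
    {
        i++;
    }
    // Compare as unsigned char, which is how the C standard defines strcmp.
    return (unsigned char) s[i] - (unsigned char) t[i];
}
```

As with strcmp, equality is tested by checking for a return value of 0, not by treating the result as true or false.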
10264 09:58:35,031 --> 09:58:37,972 There's star and there's star\nstar, but yes, that is a thing 10265 09:58:37,972 --> 09:58:41,272 and it's very often useful in the\n 10266 09:58:41,271 --> 09:58:44,541 which we haven't really talked about,\n 10267 09:58:47,582 --> 09:58:50,632 All right, so what might we now\ndo to take things up a notch? 10268 09:58:50,631 --> 09:58:53,151 Well, let's go ahead and implement\na different program here 10269 09:58:53,152 --> 09:58:56,702 that maybe tries copying some\nvalues, just to demonstrate this. 10270 09:58:56,701 --> 09:59:00,441 Let me open up a file\ncalled, how about copy.c 10271 09:59:00,442 --> 09:59:02,872 and I'm going to start\noff with a few includes. 10272 09:59:02,872 --> 09:59:06,652 So let's include the CS50 library just\n 10273 09:59:06,652 --> 09:59:11,302 Let's include-- how about stdio\nas always, let's preemptively 10274 09:59:11,302 --> 09:59:14,072 include string.h and maybe\none other in a moment. 10275 09:59:14,072 --> 09:59:17,072 Let's do int main(void) as before. 10276 09:59:17,072 --> 09:59:20,601 And then in here, let's get a\nstring from the user and just 10277 09:59:23,031 --> 09:59:26,722 And heck, we can actually just\ncall this char star if we want 10278 09:59:26,722 --> 09:59:28,834 or string, since we're\nusing the CS50 library. 10279 09:59:28,834 --> 09:59:30,002 But we'll come back to that. 10280 09:59:30,002 --> 09:59:33,591 Let's now make a copy\nof s and do s equals t 10281 09:59:33,591 --> 09:59:38,252 using a single assignment operator and\n 10282 09:59:38,252 --> 09:59:43,192 Let's go into the first character\nof t, which is t bracket zero 10283 09:59:43,192 --> 09:59:45,592 and then let's uppercase\nit using that function 10284 09:59:45,591 --> 09:59:50,931 that we've used in the past of\n 10285 09:59:50,932 --> 09:59:52,592 And actually, I should go back up here.
10286 09:59:52,591 --> 09:59:56,828 If I'm using toupper or if you use\n 10287 09:59:56,828 --> 09:59:59,661 I might not remember this offhand,\n 10288 10:00:01,521 --> 10:00:04,651 There was a bunch of helpful\nfunctions in that library as well. 10289 10:00:04,652 --> 10:00:09,457 Now at the very last line of the program\n 10290 10:00:09,457 --> 10:00:16,882 are by simply printing out %s for each\n 10291 10:00:16,881 --> 10:00:20,042 of course, and let's\nsee what happens here. 10292 10:00:21,832 --> 10:00:23,242 oh my God, so many mistakes. 10293 10:00:26,661 --> 10:00:30,211 String t equals s, sorry, so\nI'm creating two variables 10294 10:00:30,211 --> 10:00:33,141 s and t respectively,\nand I'm copying s into t. 10295 10:00:34,822 --> 10:00:40,012 There we go. ./copy, and let's\nnow type in, for instance 10296 10:00:40,012 --> 10:00:43,882 how about hi exclamation point\nin all lowercase this time 10297 10:00:47,451 --> 10:00:51,561 I don't think that's what I\nintended, so to speak, here. 10298 10:00:51,561 --> 10:00:55,381 Because notice that I got s from\nthe user, so that checks out. 10299 10:00:55,381 --> 10:00:59,063 I then copied t into\ns, which looks correct. 10300 10:00:59,063 --> 10:01:00,771 That's what we always\nuse assignment for. 10301 10:01:00,771 --> 10:01:04,551 Then I uppercase the first\nletter in t, but not s-- 10302 10:01:05,692 --> 10:01:09,412 then I printed s and t and then\nnoticed, apparently, both s 10303 10:01:13,281 --> 10:01:15,881 So if you're starting to get a\nlittle comfortable with what's 10304 10:01:15,881 --> 10:01:19,781 going on underneath the hood,\n 10305 10:01:19,781 --> 10:01:23,584 Why did both get capitalized? 10306 10:01:23,584 --> 10:01:24,792 Why did both get capitalized? 10307 10:01:25,482 --> 10:01:27,962 AUDIENCE: Could it be they're\nreferencing the same address? 10308 10:01:27,961 --> 10:01:29,372 DAVID J. MALAN: Yeah, they're\nrepresenting the same address. 
10309 10:01:31,232 --> 10:01:34,622 If you create another variable called\n 10310 10:01:34,622 --> 10:01:37,232 you are literally assigning\nit the value in s 10311 10:01:37,232 --> 10:01:40,122 which is 0x123 or something like that. 10312 10:01:40,122 --> 10:01:43,742 And so at that point in the\nstory both s and t presumably 10313 10:01:43,741 --> 10:01:47,311 have a value of 0x123,\nwhich means they technically 10314 10:01:47,311 --> 10:01:51,421 point to the same h-i\nexclamation point in memory. 10315 10:01:51,421 --> 10:01:56,252 Nowhere did I tell the computer to give\n 10316 10:01:56,252 --> 10:01:59,491 per se, I literally said just copy s. 10317 10:01:59,491 --> 10:02:03,752 So here's where an understanding of what\n 10318 10:02:03,752 --> 10:02:06,122 I'm only copying the pointers. 10319 10:02:06,122 --> 10:02:07,961 So what actually went on in memory? 10320 10:02:07,961 --> 10:02:09,601 Let's take a look here at this grid. 10321 10:02:09,601 --> 10:02:12,451 If I created s initially,\nmaybe it ends up here. 10322 10:02:12,451 --> 10:02:15,961 And I created hi in lowercase,\nand it ended up down here. 10323 10:02:15,961 --> 10:02:22,112 Then the address was, again, like\n 10324 10:02:22,112 --> 10:02:24,811 If then I create a\nsecond variable called t 10325 10:02:24,811 --> 10:02:29,042 and I call it a string, a.k.a. char\n 10326 10:02:29,042 --> 10:02:34,622 But when I copy s into t by\ndoing t equals s semicolon 10327 10:02:34,622 --> 10:02:40,227 that literally just copies s into\n 10328 10:02:40,226 --> 10:02:43,351 So if we now abstract away all these\n 10329 10:02:43,351 --> 10:02:47,731 with arrows, what we've drawn in\nthe computer's memory is this. 10330 10:02:47,732 --> 10:02:52,232 Two different pointers but storing\nthe same address, which means 10331 10:02:52,232 --> 10:02:55,122 the breadcrumbs lead to the same place. 
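The two-pointers-one-address situation can be reproduced in a few lines. The sketch below is an illustration rather than the lecture's copy.c: capitalize_first is a hypothetical helper, and a writable char array should be used in place of GetString's return value, since a string literal must not be modified.

```c
#include <ctype.h>

// Uppercases the first char that t points at, if there is one.
// It changes the memory itself, so every pointer holding the same
// address sees the change.
void capitalize_first(char *t)
{
    if (t[0] != '\0')
    {
        t[0] = toupper((unsigned char) t[0]);
    }
}
```

Given char s[] = "hi!"; and char *t = s;, calling capitalize_first(t) turns s into Hi! as well, because t = s copied only the address and not the characters.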
10332 10:02:55,122 --> 10:02:58,202 And so if you follow the t breadcrumb\n 10333 10:02:58,201 --> 10:03:02,191 it is functionally the\nsame as copying the-- 10334 10:03:02,192 --> 10:03:07,832 changing the first letter\nin the version s as well. 10335 10:03:07,832 --> 10:03:12,671 So what's the solution, then,\nto this kind of problem? 10336 10:03:12,671 --> 10:03:14,741 Even if you have no idea\nhow to do it in code 10337 10:03:14,741 --> 10:03:17,307 what's the gist of what I\nreally intended, which is 10338 10:03:17,307 --> 10:03:21,461 I want a genuine copy of s, called t. 10339 10:03:21,461 --> 10:03:25,574 I want a new h-i exclamation\npoint backslash zero. 10340 10:03:25,574 --> 10:03:27,281 What do I need to do\nto make that happen? 10341 10:03:28,249 --> 10:03:30,992 AUDIENCE: I think there's\na function called str copy. 10342 10:03:30,991 --> 10:03:34,322 DAVID J. MALAN: So there is a\nfunction called str copy, strcpy 10343 10:03:34,322 --> 10:03:36,872 which is a possible\nanswer to this question. 10344 10:03:36,872 --> 10:03:41,042 The catch with str copy is that you\n 10345 10:03:41,042 --> 10:03:43,592 what the source string is--\nthe one you want to copy-- 10346 10:03:43,591 --> 10:03:46,322 you also need to pass in the\naddress of a chunk of memory 10347 10:03:46,322 --> 10:03:50,911 into which you can copy the string, and\n 10348 10:03:50,911 --> 10:03:53,311 and we need one more building\nblock today, if you will. 10349 10:03:53,311 --> 10:03:57,722 We haven't yet seen a way to\ncreate new chunks of memory 10350 10:03:57,722 --> 10:04:00,641 and then let some other\nfunction copy into them. 10351 10:04:00,641 --> 10:04:04,021 And for this, we're going to introduce\n 10352 10:04:04,932 --> 10:04:07,652 And this is the last and most\npowerful feature perhaps, today 10353 10:04:07,652 --> 10:04:11,612 whereby we're going to introduce two\n 10354 10:04:11,612 --> 10:04:14,851 malloc means memory allocate,\nwhich literally does just that.
10355 10:04:14,851 --> 10:04:18,002 It's a function that takes a number\n 10356 10:04:18,002 --> 10:04:21,394 do you want the operating system to\n 10357 10:04:21,394 --> 10:04:23,311 It's going to find it\nand it's going to return 10358 10:04:23,311 --> 10:04:26,915 to you the address of the first byte of\n 10359 10:04:26,915 --> 10:04:29,582 and then you can do anything you\nwant with that chunk of memory. 10360 10:04:29,582 --> 10:04:31,112 free is going to do the opposite. 10361 10:04:31,112 --> 10:04:33,932 When you're done using a chunk of\n 10362 10:04:33,932 --> 10:04:37,561 you can say free it, and that means you\n 10363 10:04:37,561 --> 10:04:40,781 and then the operating system can\n 10364 10:04:40,781 --> 10:04:44,222 So this is actually evidence of\na common problem in programming. 10365 10:04:44,222 --> 10:04:48,671 If your Mac or your PC has ever been in\n 10366 10:04:48,671 --> 10:04:53,281 really slow, or it's slowing to a\n 10367 10:04:53,281 --> 10:04:56,281 one of the possible\nexplanations could be 10368 10:04:56,281 --> 10:04:59,161 that the program you're\nrunning by Apple or Microsoft 10369 10:04:59,161 --> 10:05:02,401 or whoever, maybe they're using\nmalloc or some equivalent 10370 10:05:02,402 --> 10:05:03,707 asking the operating system-- 10371 10:05:03,707 --> 10:05:05,582 Mac OS or Windows-- for,\ngive me more memory. 10372 10:05:06,362 --> 10:05:07,741 The user is creating more images. 10373 10:05:07,741 --> 10:05:09,182 The user is typing a longer essay. 10374 10:05:09,182 --> 10:05:10,802 Give me more memory, more memory.
10375 10:05:10,802 --> 10:05:15,362 If the program has a bug and never\n 10376 10:05:15,362 --> 10:05:18,061 your computer might end up using\nall of the available memory 10377 10:05:18,061 --> 10:05:21,932 and honestly, humans are not very good\n 10378 10:05:21,932 --> 10:05:24,811 Very often programs, computers\njust freeze at that point 10379 10:05:24,811 --> 10:05:28,951 or get really, really slow because\n 10380 10:05:28,951 --> 10:05:31,112 when there's not enough memory left. 10381 10:05:31,112 --> 10:05:33,722 So one of the reasons for a\ncomputer really slowing down 10382 10:05:33,722 --> 10:05:37,995 might be calling for malloc a lot, or\n 10383 10:05:37,995 --> 10:05:40,412 Which is to say, you should\nalways use these two functions 10384 10:05:40,411 --> 10:05:43,991 in concert and free memory\nonce you are done with it. 10385 10:05:43,991 --> 10:05:48,121 So let me go ahead and do this in\n 10386 10:05:48,122 --> 10:05:50,162 Let me go ahead and do this. 10387 10:05:50,161 --> 10:05:53,851 Before I copy s into t using\nsomething like str copy 10388 10:05:53,851 --> 10:05:56,487 I first need to get a bunch\nof memory from the computer. 10389 10:05:56,487 --> 10:05:59,612 So to do that, let's make this super\n 10390 10:05:59,612 --> 10:06:03,182 so I'm going to change my strings\n 10391 10:06:03,182 --> 10:06:05,641 and what I technically\nam going to store in t 10392 10:06:05,641 --> 10:06:09,692 is the address of an\navailable chunk of memory. 10393 10:06:09,692 --> 10:06:13,891 To do that, I can ask the computer\nto allocate memory for me 10394 10:06:15,302 --> 10:06:18,542 If I want to create a copy\nof h-i exclamation point 10395 10:06:22,991 --> 10:06:27,252 Because I need the h, the i, the\n 10396 10:06:28,362 --> 10:06:30,522 It's up to me to understand\nthat and ask for it. 10397 10:06:30,521 --> 10:06:32,051 It's not going to happen magically. 10398 10:06:32,052 --> 10:06:35,962 Nothing does in C. 
So I could\njust naively type four there 10399 10:06:35,961 --> 10:06:38,862 and that would be correct\nif I type in h-i exclamation 10400 10:06:38,862 --> 10:06:42,792 point or any other three letter word\n 10401 10:06:42,792 --> 10:06:46,122 I should probably do\nsomething like strlen of s 10402 10:06:46,122 --> 10:06:49,692 plus 1 for the additional\nnull character. 10403 10:06:49,692 --> 10:06:52,182 Recall that string length\ndoes it in the English sense-- 10404 10:06:52,182 --> 10:06:56,351 it returns the length of the string\n 10405 10:06:56,351 --> 10:06:58,601 the fact that I'm going\nto need that backslash zero. 10406 10:06:58,601 --> 10:07:00,972 Now let me do this old\nschool style first. 10407 10:07:00,972 --> 10:07:05,711 Let me go ahead and manually\ncopy the string s into t first. 10408 10:07:05,711 --> 10:07:13,572 So for int i equals 0, i is less than\n 10409 10:07:13,572 --> 10:07:18,521 Then inside my for loop, I'm going\n 10410 10:07:18,521 --> 10:07:22,572 i, but actually I want\nthe null character too 10411 10:07:22,572 --> 10:07:25,362 so I want to do the length\nof the string plus 1 more 10412 10:07:25,362 --> 10:07:28,031 and heck, I think I learned\nan optimization last time. 10413 10:07:28,031 --> 10:07:30,491 If I'm doing this again\nand again, I could really 10414 10:07:30,491 --> 10:07:36,222 do n equals strlen of s plus 1\nand then do i is less than n 10415 10:07:36,222 --> 10:07:38,722 just as a nice design optimization. 10416 10:07:38,722 --> 10:07:41,891 I think this for loop will\nactually handle the process, then 10417 10:07:41,891 --> 10:07:48,701 of copying every character from s into\n 10418 10:07:48,701 --> 10:07:52,031 Or I could get rid of all of that\n 10419 10:07:52,031 --> 10:07:56,201 is to use str copy, which takes as\n 10420 10:07:56,201 --> 10:07:58,661 and its second argument the source.
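The old-school copy loop described here might look like the sketch below. It is one plausible reading of the spoken steps, not the exact code typed in the lecture: copy_string is a hypothetical function name, and the NULL check is an added precaution.

```c
#include <stdlib.h>
#include <string.h>

// Returns a freshly allocated copy of s, or NULL if malloc fails.
// The caller is responsible for calling free on the result.
char *copy_string(const char *s)
{
    // strlen does not count the terminating '\0', so add 1 for it.
    size_t n = strlen(s) + 1;
    char *t = malloc(n);
    if (t == NULL)
    {
        return NULL;
    }
    // Copy every char, including the '\0' at index n - 1.
    for (size_t i = 0; i < n; i++)
    {
        t[i] = s[i];
    }
    return t;
}
```

Computing n once before the loop avoids calling strlen on every iteration. The whole loop could instead be replaced by strcpy(t, s), which takes the destination first and the source second.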
10421 10:07:58,661 --> 10:08:03,641 So copy from right to left in this case,\n 10422 10:08:03,641 --> 10:08:06,591 automatically for me as well. 10423 10:08:08,781 --> 10:08:10,762 I can now capitalize safely. 10424 10:08:10,762 --> 10:08:14,802 The first character in t, which\n 10425 10:08:14,802 --> 10:08:18,802 than s, and then I can print them both\n 10426 10:08:19,811 --> 10:08:22,691 So make copy-- all right,\nwhat did I do wrong? 10427 10:08:22,692 --> 10:08:25,781 Implicitly declaring library\nfunction malloc dot, dot, dot. 10428 10:08:25,781 --> 10:08:28,421 So we've seen this kind of error before. 10429 10:08:28,421 --> 10:08:31,511 What is-- even if you don't\nknow quite how to solve it 10430 10:08:31,512 --> 10:08:33,042 what's the essence of the solution? 10431 10:08:33,042 --> 10:08:36,072 What do I need to do to fix this\n 10432 10:08:36,072 --> 10:08:38,631 declaring a library function? 10433 10:08:41,572 --> 10:08:42,921 I need to include the library. 10434 10:08:42,921 --> 10:08:46,911 And I could look this up in the manual,\n 10435 10:08:47,722 --> 10:08:49,822 There's another library\nwe'll occasionally 10436 10:08:49,822 --> 10:08:51,921 need now called standard lib-- 10437 10:08:51,921 --> 10:08:56,031 standard library-- that contains\nmalloc and free prototypes 10438 10:08:57,381 --> 10:09:00,421 All right, let me just clear this\n 10439 10:09:00,421 --> 10:09:06,322 Now I'm good. ./copy, Enter, All right.\n 10440 10:09:06,322 --> 10:09:10,131 t and s now come back as intended. 10441 10:09:10,131 --> 10:09:15,322 s is untouched, it would seem,\nbut t is now capitalized. 10442 10:09:15,322 --> 10:09:18,711 Are any questions, then, on\nwhat we just did in code? 10443 10:09:20,533 --> 10:09:23,942 AUDIENCE: You said that\nmalloc and free go together. 10444 10:09:28,411 --> 10:09:30,453 There's a few improvements\nI want to make, so let 10445 10:09:30,453 --> 10:09:32,011 me actually do those right now. 
10446 10:09:32,012 --> 10:09:35,042 Technically, I should practice what\n 10447 10:09:35,042 --> 10:09:37,459 when I'm done with t, free t. 10448 10:09:37,459 --> 10:09:39,542 Fortunately, I don't have\nto worry about how big t 10449 10:09:39,542 --> 10:09:43,052 was-- the computer remembers how many\n 10450 10:09:43,052 --> 10:09:44,732 all of them, not just the first. 10451 10:09:46,442 --> 10:09:49,112 I don't need to do free\ns, and I shouldn't 10452 10:09:49,112 --> 10:09:52,052 because that is handled\nautomatically by the CS50 library. 10453 10:09:52,052 --> 10:09:54,452 s, recall, came from\nGetString, and we actually 10454 10:09:54,451 --> 10:09:56,829 have some fancy code in\nplace that makes sure 10455 10:09:56,830 --> 10:09:58,622 that at the end of your\nprogram's execution 10456 10:09:58,622 --> 10:10:01,682 we free any memory that we\nallocated so we don't actually 10457 10:10:01,682 --> 10:10:03,616 waste memory like I described earlier. 10458 10:10:03,616 --> 10:10:05,491 But there's actually a\ncouple of other things 10459 10:10:05,491 --> 10:10:07,991 if I really want to be\npedantic I should put in here. 10460 10:10:07,991 --> 10:10:11,432 It turns out that\nsometimes malloc can fail 10461 10:10:11,432 --> 10:10:14,169 and sometimes malloc doesn't\nhave enough memory available 10462 10:10:14,169 --> 10:10:15,961 because maybe your\ncomputer's doing so much 10463 10:10:15,961 --> 10:10:18,061 stuff there's just no\nmore RAM available. 10464 10:10:18,061 --> 10:10:20,341 So technically, I should\ndo something like this-- 10465 10:10:20,341 --> 10:10:24,901 if t equals equals null,\nwith two L's today 10466 10:10:24,902 --> 10:10:28,112 then I should just return 1 or something\n 10467 10:10:28,112 --> 10:10:29,987 I should probably print\nan error message too 10468 10:10:29,987 --> 10:10:31,662 but for now I'm going to keep it simple. 10469 10:10:31,661 --> 10:10:33,886 I should also probably check this. 
10470 10:10:33,887 --> 10:10:36,211 This is a little risky of me. 10471 10:10:36,211 --> 10:10:40,872 If I'm doing t bracket zero, this is\n 10472 10:10:40,872 --> 10:10:43,592 But what if the human just\nhit Enter at the prompt 10473 10:10:43,591 --> 10:10:46,752 and didn't even type h, let\nalone h-i exclamation point? 10474 10:10:46,752 --> 10:10:48,991 What if there is no t bracket zero? 10475 10:10:48,991 --> 10:10:54,542 So technically, what I should probably\n 10476 10:10:54,542 --> 10:11:00,482 is at least greater than zero,\n 10477 10:11:01,802 --> 10:11:04,092 And then at the very\nend if all goes well 10478 10:11:04,091 --> 10:11:08,201 I can return zero, thereby signifying\n 10479 10:11:08,201 --> 10:11:12,072 So yes, these two functions, malloc\n 10480 10:11:12,072 --> 10:11:17,012 And so if you call malloc you\nshould call free eventually. 10481 10:11:17,012 --> 10:11:22,617 But you did not call malloc for s,\n 10482 10:11:23,491 --> 10:11:24,658 AUDIENCE: Here's a question. 10483 10:11:24,658 --> 10:11:26,940 Why do we do malloc plus 1? 10484 10:11:26,940 --> 10:11:28,732 DAVID J. MALAN: Why\ndid I do malloc plus 1? 10485 10:11:28,732 --> 10:11:31,642 So malloc-- sorry, malloc\nof string length of s 10486 10:11:31,641 --> 10:11:35,264 plus 1-- the string length is the\n 10487 10:11:35,264 --> 10:11:36,472 would perceive it in English. 10488 10:11:36,472 --> 10:11:39,472 So h-i exclamation\npoint-- strlen gives me 3 10489 10:11:39,472 --> 10:11:43,162 but I know now as of last week and\n 10490 10:11:43,161 --> 10:11:45,111 and a string always has an extra byte. 10491 10:11:45,112 --> 10:11:47,662 The onus is on me to\nunderstand and apply 10492 10:11:47,661 --> 10:11:52,371 that lesson learned so that I actually\n 10493 10:11:53,991 --> 10:11:59,661 And here's just an annoying thing when\n 10494 10:11:59,661 --> 10:12:03,711 week, it turns out that\nN-U-L-L is the same idea. 
10495 10:12:03,711 --> 10:12:06,891 It's also zero, but it's zero\nin the context of pointer. 10496 10:12:06,891 --> 10:12:11,122 So long story short, you never\n 10497 10:12:11,122 --> 10:12:12,412 and we saw it on the screen. 10498 10:12:12,411 --> 10:12:17,991 You will start writing N-U-L-L when you\n 10499 10:12:19,042 --> 10:12:20,452 And what I mean by that is this. 10500 10:12:20,451 --> 10:12:23,331 If malloc fails and there's just\nnot enough memory left inside 10501 10:12:23,332 --> 10:12:26,632 of the computer for you, it's\ngot to return a special value 10502 10:12:26,631 --> 10:12:30,561 and that special value is\nN-U-L-L in all capital letters. 10503 10:12:30,561 --> 10:12:32,182 That signifies something went wrong. 10504 10:12:32,182 --> 10:12:37,131 Do not trust that I'm giving\nyou a useful return value. 10505 10:12:37,131 --> 10:12:40,752 Other questions on\nthese copies thus far? 10506 10:12:46,841 --> 10:12:48,091 DAVID J. MALAN: Good question. 10507 10:12:48,091 --> 10:12:49,981 Will str copy not work without malloc? 10508 10:12:49,982 --> 10:12:53,252 You kind of need both in\nthis case because str copy 10509 10:12:53,252 --> 10:12:56,641 by definition-- if I pull up its\n 10510 10:12:56,641 --> 10:12:58,622 to put the copied characters. 10511 10:12:58,622 --> 10:13:01,682 It's not sufficient just to\nsay char star t semicolon. 10512 10:13:01,682 --> 10:13:03,122 That only gives you a pointer. 
10513 10:13:03,122 --> 10:13:06,062 But I need another\nchunk of memory that's 10514 10:13:06,061 --> 10:13:10,171 just as big as h-i exclamation\npoint backslash zero 10515 10:13:10,171 --> 10:13:12,631 so malloc gives me a\nwhole bunch of memory 10516 10:13:12,631 --> 10:13:16,921 and then str copy fills it with h-i\n 10517 10:13:16,921 --> 10:13:19,381 So again, that's why we're\ngoing down to this lower level 10518 10:13:19,381 --> 10:13:21,423 because once you understand\nwhat needs to be done 10519 10:13:21,423 --> 10:13:23,292 you now have the functions to do it. 10520 10:13:23,292 --> 10:13:25,332 So let's actually consider\nwhat we just solved. 10521 10:13:25,332 --> 10:13:29,192 So in this next version of the program\n 10522 10:13:29,192 --> 10:13:32,702 t was initialized for the\nreturn value of malloc 10523 10:13:32,701 --> 10:13:34,741 and maybe the memory that\nI got back was here-- 10524 10:13:38,341 --> 10:13:40,651 I've left it blank\ninitially because nothing 10525 10:13:40,652 --> 10:13:42,362 is put there automatically by malloc. 10526 10:13:42,362 --> 10:13:46,472 I just get a chunk of memory that\n 10527 10:13:46,472 --> 10:13:51,391 I then assign t to that return value,\n 10528 10:13:51,391 --> 10:13:53,222 Notice there's no backslash zero. 10529 10:13:53,222 --> 10:13:56,101 This is not yet a string\nit's just a chunk of memory-- 10530 10:13:56,101 --> 10:13:58,231 four bytes-- an array of four bytes. 10531 10:13:58,232 --> 10:14:01,802 What str copy eventually did\nfor me was it copied the h over 10532 10:14:01,802 --> 10:14:06,032 the i over, the exclamation point\nover, and the backslash zero. 10533 10:14:06,031 --> 10:14:09,902 And if I didn't want to use str copy or\n 10534 10:14:09,902 --> 10:14:14,062 would have done exactly the same thing. 10535 10:14:14,061 --> 10:14:19,178 Are any questions, then,\non these examples here. 10536 10:14:28,491 --> 10:14:29,741 DAVID J. MALAN: Good question. 
10537 10:14:29,741 --> 10:14:34,091 After malloc, if I had then\nstill done just t equals s 10538 10:14:34,091 --> 10:14:37,211 it actually would have recreated\nthe same original problem 10539 10:14:37,211 --> 10:14:40,932 by just copying 0x123 from s into t. 10540 10:14:40,932 --> 10:14:44,112 So then I would have been left with\n 10541 10:14:44,112 --> 10:14:48,072 steps ago, I would have-- and\nI can't quite do it live-- 10542 10:14:48,072 --> 10:14:50,381 this arrow, if I did\nwhat you just described 10543 10:14:50,381 --> 10:14:54,358 would now be pointing over here and so\n 10544 10:14:54,358 --> 10:14:56,441 the problem, I would have\njust additionally wasted 10545 10:14:56,442 --> 10:14:59,502 four bytes temporarily that\nI'm not actually using. 10546 10:15:06,222 --> 10:15:08,180 do you always use malloc\nand str copy together? 10547 10:15:08,955 --> 10:15:10,872 These are both solving\ntwo different problems. 10548 10:15:10,872 --> 10:15:15,132 malloc's giving me enough memory to\n 10549 10:15:15,131 --> 10:15:18,941 However, you could actually use an\n 10550 10:15:18,942 --> 10:15:22,272 and you could use str copy on that, and\n 10551 10:15:22,271 --> 10:15:24,432 But thus far, it's a\nreasonable mental model 10552 10:15:24,432 --> 10:15:26,652 to have that if you\nwant to copy strings 10553 10:15:26,652 --> 10:15:30,281 you use malloc and then str\ncopy, or your own homegrown loop. 10554 10:15:42,531 --> 10:15:44,730 DAVID J. MALAN: Say that once more. 10555 10:15:55,531 --> 10:15:58,801 str copy, per its documentation,\nwill copy the whole string 10556 10:15:58,802 --> 10:16:01,022 plus the null character at the end. 10557 10:16:01,021 --> 10:16:03,481 It just assumes there will be one there. 
10558 10:16:03,482 --> 10:16:07,652 It's therefore up to you to pass str\n 10559 10:16:08,641 --> 10:16:10,832 If I only ask malloc\nfor three bytes, that 10560 10:16:10,832 --> 10:16:12,902 could have potentially\ncreated a memory problem 10561 10:16:12,902 --> 10:16:16,262 whereby str copy would just still\nblindly copy one, two, three 10562 10:16:16,262 --> 10:16:19,802 four bytes, but technically it should\n 10563 10:16:19,802 --> 10:16:22,652 You do not yet have access to the\n 10564 10:16:22,652 --> 10:16:24,902 because you never asked malloc for it. 10565 10:16:26,822 --> 10:16:29,822 AUDIENCE: So the number inside\n 10566 10:16:30,182 --> 10:16:32,057 The number inside malloc--\nit's one argument. 10567 10:16:32,057 --> 10:16:35,084 It's the number of bytes you want back. 10568 10:16:35,084 --> 10:16:38,402 AUDIENCE: Does that mean you\nhave to remember [INAUDIBLE]?? 10569 10:16:41,158 --> 10:16:43,491 DAVID J. MALAN: Yes, the onus\nis on you, the programmer 10570 10:16:43,491 --> 10:16:45,658 to remember or frankly, use\na function to figure out 10571 10:16:45,658 --> 10:16:47,182 how many bytes you actually need. 10572 10:16:47,182 --> 10:16:50,031 That's why I did not ultimately\ntype in four manually 10573 10:16:51,802 --> 10:16:55,192 So the plus 1 is necessary if you\n 10574 10:16:55,192 --> 10:16:57,832 but using strlen means\nthat I can actually 10575 10:16:57,832 --> 10:17:01,012 play around with any types of\ninputs and it will dynamically 10576 10:17:02,902 --> 10:17:05,182 So suffice it to say,\nthere's so many ways 10577 10:17:05,182 --> 10:17:07,292 already where you can\nstart to break programs. 10578 10:17:07,292 --> 10:17:10,747 Let's give you at least one tool for\n 10579 10:17:10,747 --> 10:17:12,622 And indeed, in upcoming\nproblem sets you will 10580 10:17:12,622 --> 10:17:14,722 use this to find bugs in your own code. 
10581 10:17:14,722 --> 10:17:18,351 Not just using printf, not just using\n 10582 10:17:19,561 --> 10:17:22,731 So let me go ahead and deliberately\n 10583 10:17:22,732 --> 10:17:24,872 that has some memory-related errors. 10584 10:17:24,872 --> 10:17:30,262 Let me include stdio.h at the top and\n 10585 10:17:30,262 --> 10:17:31,912 so I have access to malloc now. 10586 10:17:31,911 --> 10:17:36,531 Let me do int main(void) and then\n 10587 10:17:36,531 --> 10:17:39,711 I want to allocate\nmaybe how about three-- 10588 10:17:41,572 --> 10:17:43,552 Just for the sake of discussion. 10589 10:17:43,552 --> 10:17:48,082 So I'm going to go ahead and do malloc\n 10590 10:17:48,082 --> 10:17:51,368 I want three integers and\nan integer is four bytes 10591 10:17:51,368 --> 10:17:52,701 so technically I could do this-- 10592 10:17:52,701 --> 10:17:57,211 3 times 4, or I could do 12 but again,\n 10593 10:17:57,211 --> 10:17:59,701 and if I run this program on\na slightly different computer 10594 10:17:59,701 --> 10:18:01,221 int might be a different size. 10595 10:18:01,222 --> 10:18:05,682 so the better way to do this would be\n 10596 10:18:05,682 --> 10:18:08,932 And this is just an operator you can use\n 10597 10:18:08,932 --> 10:18:10,972 on this computer, how big is an int? 10598 10:18:10,972 --> 10:18:13,652 How big is a float, or something else? 10599 10:18:13,652 --> 10:18:15,772 So that's going to give me that many-- 10600 10:18:15,771 --> 10:18:18,171 that much memory for three ints. 10601 10:18:18,171 --> 10:18:20,182 What do I want to assign this to? 10602 10:18:20,182 --> 10:18:22,372 Well, malloc returns an address. 10603 10:18:22,372 --> 10:18:27,652 Pointers are addresses, so I'm going\n 10604 10:18:31,101 --> 10:18:33,682 This is a little less obvious,\nbut again go back to basics. 10605 10:18:33,682 --> 10:18:38,451 The right hand side here gives me a\n 10606 10:18:38,451 --> 10:18:42,021 malloc returns the address of\nthe first byte of that chunk. 
10607 10:18:42,021 --> 10:18:44,151 How do I store the address of anything? 10608 10:18:45,052 --> 10:18:48,922 The syntax for today\nis type of data, star 10609 10:18:48,921 --> 10:18:53,991 where the type of data in question\n 10610 10:18:53,991 --> 10:18:57,891 Again, it's kind of purposeless, only\n 10611 10:18:57,891 --> 10:19:03,262 here, but this is equivalent now to\n 10612 10:19:03,262 --> 10:19:06,711 in total, presumably, so I\ncan technically now do this. 10613 10:19:06,711 --> 10:19:10,851 I can go into maybe the first\n 10614 10:19:12,271 --> 10:19:20,061 Second location, the number 73, and the\n 10615 10:19:20,061 --> 10:19:22,911 Now I've deliberately\nmade two mistakes here 10616 10:19:22,911 --> 10:19:26,061 because I'm trying to trip\nover my newfound understanding 10617 10:19:26,061 --> 10:19:28,641 or my greenness with\nunderstanding pointers. 10618 10:19:28,641 --> 10:19:32,002 One, I didn't remember that I\n 10619 10:19:33,112 --> 10:19:36,502 malloc essentially returns an array,\n 10620 10:19:36,502 --> 10:19:38,902 An array of three ints,\nor more technically 10621 10:19:38,902 --> 10:19:42,741 the address of a chunk of memory\nthat could fit three ints. 10622 10:19:42,741 --> 10:19:46,042 So I can use my square bracket\n 10623 10:19:46,042 --> 10:19:48,991 and use pointer arithmetic, but\n 10624 10:19:48,991 --> 10:19:50,841 But I have made two mistakes. 10625 10:19:50,841 --> 10:19:54,441 I did not start indexing\nat zero, so line seven 10626 10:19:54,442 --> 10:19:56,302 should have been x bracket zero. 10627 10:19:56,302 --> 10:19:59,174 Line eight should have been x\nbracket 1, and then line nine 10628 10:19:59,173 --> 10:20:00,381 should have been x bracket 2. 10629 10:20:01,591 --> 10:20:04,521 The second mistake that\nI've made as a side effect 10630 10:20:04,521 --> 10:20:07,581 is I'm also touching\nmemory that I shouldn't. 
10631 10:20:07,582 --> 10:20:12,531 x bracket 3 would mean go to the\n 10632 10:20:13,341 --> 10:20:15,861 I only asked for enough\nmemory for three ints 10633 10:20:15,862 --> 10:20:19,101 not four, so this is what's\ncalled a buffer overflow. 10634 10:20:19,101 --> 10:20:22,191 I am accidentally, but\ndeliberately at the moment 10635 10:20:22,192 --> 10:20:26,312 going beyond the boundaries of\nthis array, this chunk of memory. 10636 10:20:26,311 --> 10:20:28,671 So bad things happen,\nbut not necessarily 10637 10:20:28,671 --> 10:20:30,002 by just running your program. 10638 10:20:30,002 --> 10:20:31,552 Let me go ahead and just try this. 10639 10:20:31,552 --> 10:20:37,372 Make memory, and you'll see here\nthat it compiles OK. ./memory 10640 10:20:37,372 --> 10:20:39,500 and it actually does\nnot segmentation fault 10641 10:20:39,500 --> 10:20:41,542 which comes back to that\npoint of nondeterminism. 10642 10:20:41,542 --> 10:20:43,912 Sometimes it does, sometimes it\ndoesn't-- it depends on how bad 10643 10:20:45,052 --> 10:20:48,219 But there's a program that can\nspot these kinds of mistakes 10644 10:20:48,218 --> 10:20:51,051 and I'm going to go ahead and expand\n 10645 10:20:51,052 --> 10:20:56,512 and I'm going to run not just ./memory,\n 10646 10:20:56,512 --> 10:20:59,362 This is a command that comes\nwith a lot of computer systems 10647 10:20:59,362 --> 10:21:02,432 that's designed to find\nmemory-related bugs in code. 10648 10:21:02,432 --> 10:21:04,372 So it's a new tool in\nyour toolkit today 10649 10:21:04,372 --> 10:21:06,472 and you'll use it with\nthe coming problem sets. 10650 10:21:07,671 --> 10:21:09,951 Its output, honestly, it's hideous. 10651 10:21:09,951 --> 10:21:13,341 But there's a few things\nthat will start to jump out 10652 10:21:13,341 --> 10:21:15,741 and will help you with\ntools and the problem 10653 10:21:15,741 --> 10:21:17,311 sets to see these kinds of things. 
10654 10:21:21,832 --> 10:21:25,822 That's on memory.c line\nnine, per my highlights. 10655 10:21:25,822 --> 10:21:27,711 So let me go look at line nine. 10656 10:21:27,711 --> 10:21:31,372 In what sense is this an\ninvalid write of size four? 10657 10:21:31,372 --> 10:21:33,952 Well, I'm touching memory\nthat I shouldn't, and I'm 10658 10:21:33,951 --> 10:21:35,421 touching it as though it's an int. 10659 10:21:35,421 --> 10:21:37,911 And an int is four bytes-- size four. 10660 10:21:37,911 --> 10:21:41,191 So again, this takes some practice to\n 10661 10:21:41,192 --> 10:21:44,132 but this is now a clue\nfor me, the programmer 10662 10:21:44,131 --> 10:21:47,591 that not only did I screw up, but\nI screwed up related to memory 10663 10:21:47,591 --> 10:21:50,109 and so this is just a hint, if you will. 10664 10:21:50,110 --> 10:21:52,652 It's not going to necessarily\ntell you exactly how to fix it 10665 10:21:52,652 --> 10:21:56,491 you have to wrestle with\nthe semantics, but invalid 10666 10:21:56,491 --> 10:21:58,322 write of size four-- oh, OK. 10667 10:21:58,322 --> 10:22:02,682 So I should not have indexed\npast the boundary here. 10668 10:22:02,682 --> 10:22:05,381 All right, so I\nshouldn't have done that. 10669 10:22:05,381 --> 10:22:11,125 So let me go ahead then and change this\n 10670 10:22:11,125 --> 10:22:13,292 All right, so let me go\nahead and recompile my code. 10671 10:22:13,292 --> 10:22:19,622 Make memory, ./memory, still doesn't\n 10672 10:22:20,252 --> 10:22:26,461 Let me go ahead and run Valgrind\n 10673 10:22:26,461 --> 10:22:28,682 And now there's fewer scary-- 10674 10:22:28,682 --> 10:22:32,201 less scary output now, but\nthere's still something in there. 10675 10:22:32,201 --> 10:22:35,728 Notice this-- 12 bytes in one blocks-- 10676 10:22:35,728 --> 10:22:37,561 no regard for grammar\nthere-- are definitely 10677 10:22:37,561 --> 10:22:39,331 lost in lost record one of one. 
10678 10:22:39,332 --> 10:22:42,972 Super cryptic, but this is hinting\nat a so-called memory leak. 10679 10:22:42,972 --> 10:22:46,802 The blocks of memory are lost in\n 10680 10:22:46,802 --> 10:22:48,242 I asked for them but I never-- 10681 10:22:51,368 --> 10:22:53,701 And this is the arcane way\nof saying, you've screwed up. 10682 10:22:54,911 --> 10:22:57,181 So this is an easy fix, fortunately. 10683 10:22:57,182 --> 10:23:01,572 Once I'm done with this memory I\n 10684 10:23:01,572 --> 10:23:03,991 So now let me go ahead\nand rerun make memory 10685 10:23:03,991 --> 10:23:07,801 it still runs fine so all the while\n 10686 10:23:08,942 --> 10:23:10,622 But let me run Valgrind one more time. 10687 10:23:14,701 --> 10:23:16,891 All heap blocks were\nfreed, whatever that means. 10688 10:23:18,732 --> 10:23:21,842 And even though it's still a little\n 10689 10:23:21,841 --> 10:23:25,345 and in fact, it's pretty explicit--\n 10690 10:23:27,002 --> 10:23:30,192 So even though this is one of\nthe most arcane tools we'll use 10691 10:23:30,192 --> 10:23:32,702 it's also one of the most\npowerful because it can see things 10692 10:23:32,701 --> 10:23:36,031 that you, the human, might not, and\n 10693 10:23:36,031 --> 10:23:38,101 It does a much closer\nreading of your code 10694 10:23:38,101 --> 10:23:43,862 while it's running to figure\nout exactly what is going on. 10695 10:23:43,862 --> 10:23:46,141 Any questions, then, on this tool? 10696 10:23:46,141 --> 10:23:50,042 And we'll guide you after today\nwith actually using this, too. 10697 10:23:50,042 --> 10:23:52,561 Just helps you find\nmemory-related mistakes 10698 10:23:52,561 --> 10:23:55,381 that you might now be capable of making. 10699 10:23:55,381 --> 10:23:57,542 All right, let's do one\nother memory-related thing. 10700 10:23:57,542 --> 10:23:59,531 Let me shrink my terminal window here. 10701 10:23:59,531 --> 10:24:03,271 Let me create one other\nfile here called garbage.c. 
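The two fixes walked through above for memory.c (indexing from zero, and freeing the memory once done) might look roughly like this. It is a sketch, not the on-screen file: the function name store_three is hypothetical, and the first and third stored values are arbitrary placeholders because the transcript elides them.

```c
#include <stdlib.h>

// Hypothetical stand-in for the body of main in memory.c.
// Returns the sum of the three stored ints, or -1 if malloc fails.
int store_three(void)
{
    // Ask for room for three ints, whatever size an int is here.
    int *x = malloc(3 * sizeof(int));
    if (x == NULL)
    {
        return -1;
    }

    // Valid indices are 0, 1 and 2; writing to x[3] would be the
    // "invalid write of size 4" that Valgrind reported.
    x[0] = 10; // placeholder value
    x[1] = 73;
    x[2] = 30; // placeholder value

    int sum = x[0] + x[1] + x[2];

    // Hand the 12 bytes back so nothing is "definitely lost".
    free(x);
    return sum;
}
```

With both changes in place there is no out-of-bounds write and no allocation left unfreed, which are the two things Valgrind complained about in the walkthrough.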
10702 10:24:03,271 --> 10:24:06,781 It turns out there's a term of art\n 10703 10:24:06,781 --> 10:24:08,292 that we can reveal as follows. 10704 10:24:08,292 --> 10:24:11,281 Let me include stdio.h,\nand let me include-- 10705 10:24:11,281 --> 10:24:14,822 how about stdlib.h, and\nthen let me give myself int 10706 10:24:14,822 --> 10:24:17,921 main(void), and then in this\nrelatively short program 10707 10:24:17,921 --> 10:24:20,822 let me give myself three\nints using last week's 10708 10:24:20,822 --> 10:24:24,781 notation, just int scores bracket\n 10709 10:24:24,781 --> 10:24:28,801 Then let me go ahead and do for\n 10710 10:24:28,802 --> 10:24:34,052 i plus plus, then let me go ahead\nand print out, %i backslash n 10711 10:24:38,851 --> 10:24:44,141 This code, pretty sure is going\n 10712 10:24:46,531 --> 10:24:51,061 I've forgotten a step even though the\n 10713 10:24:53,792 --> 10:24:56,281 Yeah, I didn't provide the\nscores, so I didn't actually 10714 10:24:56,281 --> 10:25:00,211 initialize the array called scores\n 10715 10:25:00,211 --> 10:25:03,752 What's curious about this, though,\n 10716 10:25:04,442 --> 10:25:08,402 Let me go ahead and playfully\nmake garbage, Enter 10717 10:25:08,402 --> 10:25:10,982 and it's an apt description\nbecause what I'm about to see 10718 10:25:10,982 --> 10:25:13,592 are so-called garbage values. 10719 10:25:13,591 --> 10:25:18,421 When you, the programmer, do not\n 10720 10:25:18,421 --> 10:25:21,239 values, sometimes, who knows\nwhat's going to be there. 
10721 10:25:21,239 --> 10:25:23,072 The computer's been\ndoing some other things 10722 10:25:23,072 --> 10:25:26,521 there's a bit of work that happens even\n 10723 10:25:26,521 --> 10:25:29,761 so there might be remnants\nof past ints, chars, strings 10724 10:25:29,762 --> 10:25:32,402 floats-- anything else in\nthere and what you're seeing 10725 10:25:32,402 --> 10:25:38,022 is those garbage values, which is\n 10726 10:25:38,021 --> 10:25:40,961 as I just did, to initialize\nthe value of some variable. 10727 10:25:40,961 --> 10:25:42,961 And this is actually\npretty dangerous, and there 10728 10:25:42,961 --> 10:25:46,442 have been many examples of\nsoftware being compromised 10729 10:25:46,442 --> 10:25:49,622 because of one of these issues\n 10730 10:25:49,622 --> 10:25:53,972 and all of a sudden users, maybe people\n 10731 10:25:53,972 --> 10:25:57,842 applications, could suddenly see the\n 10732 10:25:58,951 --> 10:26:01,411 Maybe someone's password that\nhad been previously typed in 10733 10:26:01,411 --> 10:26:03,391 or some other value like\na credit card number 10734 10:26:03,391 --> 10:26:04,951 that had been previously typed in. 10735 10:26:04,951 --> 10:26:06,932 There are different\ndefense mechanisms in place 10736 10:26:06,932 --> 10:26:10,472 to generally make this not\nso likely, but it's certainly 10737 10:26:10,472 --> 10:26:13,531 very possible, at least\nin this kind of context 10738 10:26:13,531 --> 10:26:17,461 to see values that you\nprobably shouldn't because they 10739 10:26:17,461 --> 10:26:20,981 might be remnants from\nsomething else that used them. 10740 10:26:20,982 --> 10:26:25,062 So this is to say again, you have this\n 10741 10:26:25,061 --> 10:26:28,381 but also now you have this great\nhacking ability to poke around 10742 10:26:28,381 --> 10:26:31,801 the contents of memory, and this is\n 10743 10:26:31,802 --> 10:26:35,792 trying to find ways to exploit systems. 
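One way to avoid the garbage values shown above is to give every element a value before reading it. The sketch below assumes a hypothetical helper named clear_scores; it is not part of garbage.c as typed in the lecture.

```c
#include <stddef.h>

// Hypothetical helper: sets every element of scores to zero so that
// later reads see a known value rather than leftover memory contents.
void clear_scores(int scores[], size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        scores[i] = 0;
    }
}
```

A caller would declare int scores[3]; and then call clear_scores(scores, 3); before printing. For a fixed-size array, writing int scores[3] = {0}; at the declaration has the same effect.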
10744 10:26:40,432 --> 10:26:42,472 All right, let's go ahead and\ntake a quick five minute break 10745 10:26:42,472 --> 10:26:44,872 and when we come back, we'll\nbuild on these final topics. 10746 10:26:47,031 --> 10:26:50,841 First, just a little programmer\n 10747 10:26:50,841 --> 10:26:53,211 will make a little bit of sense to you. 10748 10:26:53,211 --> 10:26:57,682 And what we'll also do next to take a\n 10749 10:26:57,682 --> 10:27:00,862 animates with claymation, if you\n 10750 10:27:00,862 --> 10:27:03,862 exactly what happens now if you have\n 10751 10:27:03,862 --> 10:27:07,364 values are and how they get there, and\n 10752 10:27:07,364 --> 10:27:09,531 It's one thing just to print\nthem out as I just did 10753 10:27:09,531 --> 10:27:13,792 it's another if you actually mistake\n 10754 10:27:13,792 --> 10:27:17,241 because garbage values are just zeros\n 10755 10:27:17,241 --> 10:27:20,121 But if you use that new\ndereference operator, the star 10756 10:27:20,122 --> 10:27:24,472 and try to go to a garbage value\nthinking incorrectly that it's 10757 10:27:24,472 --> 10:27:26,872 a valid pointer, bad things can happen. 10758 10:27:26,872 --> 10:27:31,792 Computers can crash or more familiarly,\n 10759 10:27:31,792 --> 10:27:34,762 So allow me to introduce, if we\n 10760 10:27:34,762 --> 10:27:36,472 our friend Binky from Stanford. 10761 10:27:40,311 --> 10:27:41,901 SPEAKER 1: Hey Binky, wake up. 10762 10:27:48,544 --> 10:27:50,461 SPEAKER 1: Well, to get\nstarted, I guess we're 10763 10:27:50,461 --> 10:27:52,082 going to need a couple of pointers. 10764 10:27:52,082 --> 10:27:56,359 BINKY: OK, this code allocates two\n 10765 10:27:56,942 --> 10:28:00,549 Well, I see the two pointers, but they\n 10766 10:28:01,381 --> 10:28:03,511 Initially, pointers\ndon't point to anything. 10767 10:28:03,512 --> 10:28:06,542 The things they point to are called\n 10768 10:28:07,535 --> 10:28:08,702 SPEAKER 1: Oh, right, right. 
10769 10:28:11,381 --> 10:28:13,711 So how do you allocate a pointee? 10770 10:28:13,711 --> 10:28:17,281 BINKY: OK, well this code\nallocates a new integer pointee 10771 10:28:17,281 --> 10:28:20,354 and this part sets x to point to it. 10772 10:28:20,355 --> 10:28:21,772 SPEAKER 1: Hey, that looks better. 10773 10:28:23,381 --> 10:28:26,771 BINKY: OK, I'll dereference the\npointer x to store the number 10774 10:28:28,902 --> 10:28:32,562 For this trick, I'll need my\nmagic wand of dereferencing. 10775 10:28:32,561 --> 10:28:35,951 SPEAKER 1: Your magic\nwand of dereferencing? 10776 10:28:37,802 --> 10:28:39,512 BINKY: This is what the code looks like. 10777 10:28:39,512 --> 10:28:42,307 I'll just set up the number and-- 10778 10:28:44,531 --> 10:28:49,451 So doing a dereference on x follows\n 10779 10:28:49,451 --> 10:28:51,491 in this case to store 42 in there. 10780 10:28:51,491 --> 10:28:56,112 Hey, try using it to store the number\n 10781 10:28:57,252 --> 10:29:01,631 I'll just go over here to y\nand get the number 13 set up 10782 10:29:01,631 --> 10:29:06,161 and then take the wand of\ndereferencing and just-- 10783 10:29:07,241 --> 10:29:09,461 SPEAKER 1: Oh hey, that didn't work. 10784 10:29:09,461 --> 10:29:13,182 Say, Binky, I don't think\ndereferencing y is a good idea 10785 10:29:13,182 --> 10:29:16,377 because setting up the\npointee is a separate step 10786 10:29:16,377 --> 10:29:18,912 and I don't think we ever did it. 10787 10:29:19,961 --> 10:29:22,391 SPEAKER 1: Yeah, we\nallocated the pointer y 10788 10:29:22,391 --> 10:29:25,631 but we never set it\nto point to a pointee. 10789 10:29:26,800 --> 10:29:28,842 SPEAKER 1: Hey, you're\nlooking good there, Binky. 10790 10:29:28,841 --> 10:29:31,721 Can you fix it so that y points\nto the same pointee as x? 10791 10:29:31,722 --> 10:29:35,082 BINKY: Sure, I'll use my magic\nwand of pointer assignment. 10792 10:29:35,082 --> 10:29:37,332 SPEAKER 1: Is that going to\nbe a problem, like before? 
10793 10:29:37,332 --> 10:29:39,222 BINKY: No, this doesn't\ntouch the pointees 10794 10:29:39,222 --> 10:29:42,851 it just changes one pointer to\n 10795 10:29:43,872 --> 10:29:46,542 Now y points to the same place as x. 10796 10:29:48,432 --> 10:29:51,491 It has a pointee so you can try\nthe wand of dereferencing again 10797 10:29:56,434 --> 10:29:57,641 SPEAKER 1: Hey, look at that. 10798 10:29:57,641 --> 10:29:59,472 Now dereferencing works on y. 10799 10:29:59,472 --> 10:30:03,522 And because the pointers are sharing\n 10800 10:30:05,232 --> 10:30:07,272 So are we going to switch places now? 10801 10:30:07,271 --> 10:30:09,191 SPEAKER 1: Oh look, we're out of time. 10802 10:30:10,311 --> 10:30:12,531 That's from our friend\nNick Parlante at Stanford. 10803 10:30:12,531 --> 10:30:14,871 So let's consider what\nNick did here as Binky. 10804 10:30:14,872 --> 10:30:16,942 So here is all the code together. 10805 10:30:16,942 --> 10:30:20,619 These first couple of lines were not\n 10806 10:30:20,618 --> 10:30:21,951 they move the stars to the left. 10807 10:30:22,701 --> 10:30:25,612 Again, more conventional\nmight be this syntax here. 10808 10:30:26,822 --> 10:30:30,141 It's OK to create\nvariables, even pointers 10809 10:30:30,141 --> 10:30:33,771 and not assign them a value initially\n 10810 10:30:33,771 --> 10:30:36,291 So we eventually do\nhere, with this line. 10811 10:30:36,292 --> 10:30:39,351 We assign to x the return\nvalue of malloc, which 10812 10:30:39,351 --> 10:30:41,182 is presumably the address of something. 10813 10:30:41,182 --> 10:30:44,432 To be fair, we should really\nbe checking for null as well 10814 10:30:44,432 --> 10:30:46,351 but that's not the biggest problem here. 
10815 10:30:46,351 --> 10:30:48,841 The biggest problem is\nnot even this next line 10816 10:30:48,841 --> 10:30:54,591 which means go to the memory location\n 10817 10:30:54,591 --> 10:30:56,811 That's fine, because\nagain, malloc returns 10818 10:30:56,811 --> 10:30:59,061 the address of some chunk of memory. 10819 10:30:59,061 --> 10:31:01,161 This chunk of memory is\nbig enough for an int. 10820 10:31:01,161 --> 10:31:04,072 x is therefore going to store\nthe address of that chunk that's 10821 10:31:05,031 --> 10:31:08,902 Star x recalls the dereference\n 10822 10:31:10,701 --> 10:31:13,822 It's like going to the mailbox\nand putting the number 42 in it 10823 10:31:13,822 --> 10:31:16,732 instead of taking the number\n50 out, like we did before. 10824 10:31:18,411 --> 10:31:21,651 This is where Binky lost\nhis head, so to speak. 10825 10:31:24,042 --> 10:31:26,042 AUDIENCE: We haven't yet\nallocated space for it. 10826 10:31:26,591 --> 10:31:28,502 We haven't yet allocated space for y. 10827 10:31:28,502 --> 10:31:31,412 There's no mention of malloc,\nthere's no assignment of y 10828 10:31:32,951 --> 10:31:35,801 So this would be, go\nto the address in y 10829 10:31:35,802 --> 10:31:39,192 but if there is no known address in\n 10830 10:31:39,192 --> 10:31:42,122 which means go to some random address\n 10831 10:31:42,932 --> 10:31:47,582 that might cause what we've seen in the\n 10832 10:31:47,582 --> 10:31:49,472 Now this, fortunately,\nis the kind of thing 10833 10:31:49,472 --> 10:31:53,402 that if you don't quite have the eye\n 10834 10:31:53,402 --> 10:31:55,272 could help you find as well. 10835 10:31:55,271 --> 10:31:59,041 But it's just another example of\n 10836 10:31:59,042 --> 10:32:02,472 of having control now\nover memory at this level. 10837 10:32:02,972 --> 10:32:04,805 Well, let's go ahead\nand do one other thing. 
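The sequence from the Binky video, with the unsafe step left out, can be sketched as follows. The function name binky and the NULL check are additions for illustration; the video's code is a bare sequence of statements.

```c
#include <stdlib.h>

// Hypothetical wrapper around the steps from the Binky video.
// Returns the int that x points to at the end, or -1 if malloc fails.
int binky(void)
{
    int *x; // pointer with no pointee yet
    int *y; // pointer with no pointee yet

    // Allocate an int pointee and make x point to it.
    x = malloc(sizeof(int));
    if (x == NULL)
    {
        return -1;
    }

    // Dereference x to store 42 in its pointee.
    *x = 42;

    // Writing *y = 13 at this point would be the bug from the video:
    // y holds a garbage address, so the write could go anywhere.

    // Pointer assignment: y now points to the same pointee as x.
    y = x;

    // Dereferencing y is safe now, and it changes what x sees too.
    *y = 13;

    int result = *x;
    free(x);
    return result;
}
```

Because x and y share one pointee after y = x, the final value read through x is 13, not 42.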
10838 10:32:04,805 --> 10:32:07,947 Considering from last week\nthat this notion of swapping 10839 10:32:07,947 --> 10:32:09,572 was actually a really common operation. 10840 10:32:09,572 --> 10:32:12,572 We had all of our volunteers come\n 10841 10:32:12,572 --> 10:32:14,942 during bubble sort and\neven selection sort 10842 10:32:14,942 --> 10:32:17,042 and we just took for\ngranted that the two 10843 10:32:17,042 --> 10:32:18,974 humans would swap themselves just fine. 10844 10:32:18,974 --> 10:32:21,182 But there needs to be code\nto do that if you actually 10845 10:32:21,182 --> 10:32:24,999 implement bubble sort, selection sort,\n 10846 10:32:24,999 --> 10:32:26,582 So let's consider some code like this. 10847 10:32:26,582 --> 10:32:28,652 We'll keep it simple\nlike last week, and where 10848 10:32:28,652 --> 10:32:35,700 we wanted to swap some values like\n 10849 10:32:35,699 --> 10:32:38,491 Void because I'm not going to return\n 10850 10:32:39,391 --> 10:32:44,701 So here, for instance,\nmight be some code for this. 10851 10:32:44,701 --> 10:32:45,909 But why is it so complicated? 10852 10:32:45,910 --> 10:32:47,494 Here, let's actually take a step back. 10853 10:32:48,661 --> 10:32:50,281 I think we have time\nfor one more volunteer. 10854 10:32:50,281 --> 10:32:51,739 Could we get someone to come on up? 10855 10:32:51,739 --> 10:32:54,031 You have to be comfy\non camera and you're 10856 10:32:54,031 --> 10:32:57,061 being asked to help with your-- oh,\n 10857 10:32:57,061 --> 10:33:01,002 So whoever has their\nfriend doing this here-- 10858 10:33:01,982 --> 10:33:03,872 Now they're pointing it over here. 10859 10:33:03,872 --> 10:33:05,612 Now, literally an arm is being twisted. 10860 10:33:25,002 --> 10:33:27,078 Who were you trying to volunteer? 10861 10:33:29,332 --> 10:33:33,652 So here we have for Mariana two\n 10862 10:33:33,652 --> 10:33:35,182 just so that they're super obvious. 
10863 10:33:35,182 --> 10:33:37,586 And suppose that the problem\nat hand, like last week 10864 10:33:37,586 --> 10:33:40,461 it's just to swap two values, as\n 10865 10:33:40,461 --> 10:33:42,472 two people and we want to swap them. 10866 10:33:42,472 --> 10:33:45,862 But let's consider these glasses\n 10867 10:33:45,862 --> 10:33:47,572 in an array, and you know what? 10868 10:33:47,572 --> 10:33:50,042 I'd really like you to swap the values. 10869 10:33:50,042 --> 10:33:53,601 So orange has to go in there,\nand purple has to go in there. 10870 10:33:54,555 --> 10:33:56,722 And we'll see if we can\nthen translate that to code. 10871 10:34:02,472 --> 10:34:04,932 So presumably, you're\nstruggling mentally 10872 10:34:04,932 --> 10:34:08,141 with how you would do this without\n 10873 10:34:08,682 --> 10:34:11,552 Let me go ahead and we do have a\n 10874 10:34:11,552 --> 10:34:14,052 So if I hand you this, how would\nyou now solve this problem? 10875 10:34:16,542 --> 10:34:18,292 AUDIENCE: I would go\nlike that, but it's-- 10876 10:34:18,292 --> 10:34:18,942 DAVID J. MALAN: No, that's-- 10877 10:34:20,232 --> 10:34:23,342 Go do it-- go with your instincts. 10878 10:34:26,042 --> 10:34:28,171 Go to whatever your instincts are. 10879 10:34:34,561 --> 10:34:37,188 Yeah, so a little-- so\nstrictly speaking, probably 10880 10:34:37,188 --> 10:34:39,271 shouldn't have moved the\nglasses just because that 10881 10:34:39,271 --> 10:34:41,291 would be like moving\nthe array locations 10882 10:34:41,292 --> 10:34:43,972 so let's actually do it one\nmore time but the glasses now 10883 10:34:43,972 --> 10:34:45,722 have to go back where\nthey originally are. 10884 10:34:45,722 --> 10:34:50,412 So how would you swap these now,\nusing this temporary variable? 10885 10:34:51,836 --> 10:34:54,461 Otherwise we'd be completely\nuprooting the array, for instance 10886 10:34:54,461 --> 10:34:56,442 by just physically moving it around. 
10887 10:34:56,442 --> 10:34:58,932 So you moved the orange into\nthis temporary variable 10888 10:34:58,932 --> 10:35:01,271 then you copied the purple\ninto where the orange was 10889 10:35:01,271 --> 10:35:03,641 and now, presumably, excellent. 10890 10:35:03,641 --> 10:35:06,461 The orange is going to end\nup where the purple once was 10891 10:35:06,461 --> 10:35:08,981 and this temporary variable,\nit stored up some extra memory. 10892 10:35:08,982 --> 10:35:11,802 It was necessary at the time,\nbut not necessary, ultimately. 10893 10:35:11,802 --> 10:35:17,492 But a round of applause if we could,\n 10894 10:35:17,491 --> 10:35:21,671 So the fact that it\ninstantly occurred to Mariana 10895 10:35:21,671 --> 10:35:25,072 that you need some temporary variable\n 10896 10:35:25,072 --> 10:35:28,311 and in fact this code here,\nthat we might glimpse now 10897 10:35:28,311 --> 10:35:30,398 is reminiscent of\nexactly that algorithm 10898 10:35:30,398 --> 10:35:33,231 where A and B, at the end of the\n 10899 10:35:33,232 --> 10:35:35,242 Just like the second\ntime, the two glasses 10900 10:35:35,241 --> 10:35:37,641 have to kind of stay put, even\n 10901 10:35:37,641 --> 10:35:39,391 but they're going back\nto where they were 10902 10:35:39,391 --> 10:35:41,391 is kind of like having\ntwo values, A and B 10903 10:35:41,391 --> 10:35:44,451 and you just have a temporary\nvariable into which you copy A 10904 10:35:44,451 --> 10:35:47,691 then you change A with\nB, then you go and change 10905 10:35:47,692 --> 10:35:50,632 B with whatever the\noriginal value of A was 10906 10:35:50,631 --> 10:35:55,281 because you temporarily stored it\n 10907 10:35:55,281 --> 10:35:59,521 Unfortunately, this code doesn't\nnecessarily work as intended. 
10908 10:35:59,521 --> 10:36:02,752 So let me go over to my\nVS Code here and open up 10909 10:36:02,752 --> 10:36:06,021 a program called swap.c,\nand in swap.c, let 10910 10:36:06,021 --> 10:36:11,002 me whip up something really quickly\n 10911 10:36:12,921 --> 10:36:18,112 Inside of main let me do something\nlike x gets 1 and y gets 2. 10912 10:36:18,112 --> 10:36:23,241 Let me just print out as a\nvisual confirmation that x is %i 10913 10:36:23,241 --> 10:36:28,252 y is %i backslash n, plugging\nin x and y, respectively. 10914 10:36:28,252 --> 10:36:31,432 Then let me call a swap function\n 10915 10:36:31,432 --> 10:36:38,122 Swap x and y And then let me print out\n 10916 10:36:38,122 --> 10:36:41,692 just to print out again what they are,\n 10917 10:36:41,692 --> 10:36:44,855 2 first, then 2, 1 the second time. 10918 10:36:44,855 --> 10:36:46,522 Now how is swap going to be implemented? 10919 10:36:46,521 --> 10:36:49,951 Let me implement it exactly\nas on the screen a moment ago. 10920 10:36:52,372 --> 10:36:54,862 or let's call it int A\nfor consistency, int B. 10921 10:36:54,862 --> 10:36:57,022 But I could always call\nthose anything I want. 10922 10:36:57,021 --> 10:37:01,252 Int tmp gets A, A gets B, B gets tmp. 10923 10:37:01,252 --> 10:37:04,341 So exactly as I proposed\na moment ago, and exactly 10924 10:37:04,341 --> 10:37:08,121 as Mariana really implemented\nit using these glasses of water. 10925 10:37:08,122 --> 10:37:11,932 I need to now include my prototype,\n 10926 10:37:11,932 --> 10:37:15,622 And I'll just copy/paste that up here,\n 10927 10:37:15,622 --> 10:37:18,832 So make swap-- so far, so good-- swap-- 10928 10:37:18,832 --> 10:37:23,692 x is now 1, y is 2, x is 1, y is 2. 10929 10:37:23,692 --> 10:37:29,452 So there seems to be a bit of a\nbug here, but why might this be? 
10930 10:37:29,451 --> 10:37:33,291 This code does not in fact work, even\n 10931 10:37:35,086 --> 10:37:41,600 AUDIENCE: Because A and B have different\n 10932 10:37:41,599 --> 10:37:43,391 DAVID J. MALAN: Good,\nand let me summarize. 10933 10:37:43,391 --> 10:37:46,722 A and B do indeed have\ndifferent addresses of x and y 10934 10:37:46,722 --> 10:37:50,322 and in fact what happens when you\n 10935 10:37:50,322 --> 10:37:54,582 calling swap, passing in x and\ny, you are calling a function 10936 10:37:56,211 --> 10:37:57,972 And this is a term of\nart that just means 10937 10:37:57,972 --> 10:38:02,682 you are passing in copies of x and\n 10938 10:38:02,682 --> 10:38:06,912 A and B in the context of this\n 10939 10:38:06,911 --> 10:38:10,811 Now technically, these\nnames are local only. 10940 10:38:10,811 --> 10:38:13,572 I could have called this x,\nI could have called this y 10941 10:38:13,572 --> 10:38:17,891 I could have changed this to x,\n 10942 10:38:17,891 --> 10:38:19,391 The problem would still remain. 10943 10:38:19,391 --> 10:38:23,322 Just because you use the same names\n 10944 10:38:23,322 --> 10:38:24,911 that doesn't mean they're the same. 10945 10:38:24,911 --> 10:38:26,481 They just look the same to you. 10946 10:38:26,482 --> 10:38:31,182 But indeed, swap is going to get copies\n 10947 10:38:33,822 --> 10:38:36,161 x and y will be copies of the original. 10948 10:38:36,161 --> 10:38:38,501 So for clarity, let me\nrevert this back to A and B 10949 10:38:38,502 --> 10:38:42,311 just to make super clear that they're\n 10950 10:38:42,311 --> 10:38:44,261 but there's indeed a problem there. 10951 10:38:44,262 --> 10:38:46,402 This function actually works fine. 10952 10:38:47,722 --> 10:38:52,281 Let me go ahead and print out\ninside of this. printf A is %i 10953 10:38:52,281 --> 10:38:56,351 B is %i backslash n, and\nthen I'll print A and B. 
10954 10:38:56,351 --> 10:38:59,561 And let me do that same thing at the\n 10955 10:39:02,112 --> 10:39:06,101 Make swap, ./swap,\nand this is promising. 10956 10:39:06,101 --> 10:39:12,731 Initially, x is 1, y is 2, A\nis 1, B is 2, A is 2, B is 1 10957 10:39:12,732 --> 10:39:14,959 but then nope-- x is 1, y is 2. 10958 10:39:14,959 --> 10:39:17,292 So if anything, I've confirmed\nthat the logic is right-- 10959 10:39:17,292 --> 10:39:20,412 Mariana's logic is right, but\nthere's something about C. 10960 10:39:20,411 --> 10:39:24,281 There's something about using one\n 10961 10:39:26,031 --> 10:39:30,381 The fact that I'm passing in copies of\n 10962 10:39:30,381 --> 10:39:31,752 So what in fact is going on? 10963 10:39:31,752 --> 10:39:34,572 Well again, inside of your computer's\n 10964 10:39:34,572 --> 10:39:36,447 and we've been talking\nabout them abstractly 10965 10:39:36,447 --> 10:39:38,502 it's just this grid of memory locations. 10966 10:39:38,502 --> 10:39:41,703 It turns out that your\ncomputer uses this memory 10967 10:39:41,703 --> 10:39:42,911 in a pretty conventional way. 
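The behaviour observed above, where the swap logic is right but x and y in main stay unchanged, can be reproduced with a sketch like the one below. swap_by_value mirrors the three-line swap from the lecture, including the printf inside it. swap_by_pointer is an assumed contrast that does not appear in this part of the transcript: it receives addresses rather than copies.

```c
#include <stdio.h>

// Receives copies of the caller's values. The swap happens, but only
// on the copies a and b, which disappear when the function returns.
void swap_by_value(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
    printf("a is %i, b is %i\n", a, b);
}

// Assumed contrast, not from this part of the lecture: receives the
// addresses of the caller's variables and swaps what they point to.
void swap_by_pointer(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}
```

Calling swap_by_value(x, y) with x as 1 and y as 2 leaves both unchanged in the caller, matching the output seen in the lecture, whereas swap_by_pointer(&x, &y) would change the caller's variables.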
10968 10:39:42,911 --> 10:39:46,991 It's not just random, where it just\n 10969 10:39:46,991 --> 10:39:50,951 it actually uses different parts of\n 10970 10:39:50,951 --> 10:39:54,341 And you have control over a lot of\n 10971 10:39:55,184 --> 10:39:56,891 And let's go ahead\nand zoom out from this 10972 10:39:56,891 --> 10:40:00,942 and consider that within your computer's\n 10973 10:40:00,942 --> 10:40:04,362 do is actually store initially,\nall of the zeros and ones 10974 10:40:04,362 --> 10:40:08,362 that you compiled in the top of\n 10975 10:40:08,362 --> 10:40:11,592 So when you compile a program and\n 10976 10:40:11,591 --> 10:40:15,011 or on a Mac or PC you double\nclick on it, the computer first-- 10977 10:40:15,012 --> 10:40:20,141 the operating system first-- loads all\n 10978 10:40:20,141 --> 10:40:24,731 Machine code, into just one big chunk\n 10979 10:40:24,732 --> 10:40:28,662 Below that it stores global\nvariables-- any variables 10980 10:40:28,661 --> 10:40:32,543 you have created in your program\n 10981 10:40:33,252 --> 10:40:35,052 Generally, the top of your file. 10982 10:40:35,052 --> 10:40:36,995 Globals tend to go at the top there. 10983 10:40:36,995 --> 10:40:39,912 Then there's this chunk of memory\n 10984 10:40:39,911 --> 10:40:42,311 and we saw that word\nbriefly in Valgrind's output 10985 10:40:42,311 --> 10:40:45,941 and then there's this other\nchunk of memory called the stack. 10986 10:40:45,942 --> 10:40:51,072 And it turns out that up until this\n 10987 10:40:51,072 --> 10:40:56,322 Any time you use local variables in\n 10988 10:40:56,322 --> 10:41:00,042 Any time you use malloc, that\nmemory ends up on the heap. 
10989 10:41:00,042 --> 10:41:02,112 Now as the arrow suggests,\nthis actually looks 10990 10:41:02,112 --> 10:41:05,194 like a problem waiting to happen because\n 10991 10:41:05,194 --> 10:41:07,031 heap, and more and more\nand more stack, it's 10992 10:41:07,031 --> 10:41:09,762 like two things barreling down the\n 10993 10:41:10,252 --> 10:41:11,502 And that's actually a problem. 10994 10:41:11,502 --> 10:41:14,841 If you've ever heard the phrase\n 10995 10:41:14,841 --> 10:41:16,631 this is the origin of its name. 10996 10:41:16,631 --> 10:41:18,881 When you start to use\nmore and more and more 10997 10:41:18,881 --> 10:41:21,161 memory by calling lots\nand lots of functions 10998 10:41:21,161 --> 10:41:23,621 or using lots and lots\nof local variables 10999 10:41:23,622 --> 10:41:25,872 you use a lot of this stack memory. 11000 10:41:25,872 --> 10:41:29,322 Or if you use malloc a lot and keep\n 11001 10:41:29,322 --> 10:41:33,042 and never really, or rarely calling\n 11002 10:41:33,042 --> 10:41:36,881 and eventually these two things might\n 11003 10:41:37,932 --> 10:41:40,552 The program will crash or\nsomething bad will happen. 11004 10:41:40,552 --> 10:41:43,332 So the onus is on you\njust to don't do that. 11005 10:41:43,332 --> 10:41:45,582 But this is the design,\ngenerally, of what's 11006 10:41:45,582 --> 10:41:47,472 going on inside of\nyour computer's memory. 11007 10:41:47,472 --> 10:41:51,072 Now within that memory, though,\nthere are certain conventions 11008 10:41:51,072 --> 10:41:52,932 focusing on here, the stack. 
11009 10:41:52,932 --> 10:41:55,391 And in fact, let me go\nover here with a marker 11010 10:41:55,391 --> 10:41:58,881 and say that this represents the\n 11011 10:41:58,881 --> 10:42:03,161 And so here we have a whole bunch of\n 11012 10:42:03,161 --> 10:42:05,451 represents a byte of memory\nand this, for instance 11013 10:42:05,451 --> 10:42:08,141 might represent four bytes\naltogether-- good enough for an int 11014 10:42:09,472 --> 10:42:13,811 So in my original code that I wrote\n 11015 10:42:13,811 --> 10:42:16,211 what is in fact going on\ninside the swap function? 11016 10:42:16,211 --> 10:42:20,262 We can visualize it like this-- when\n 11017 10:42:20,262 --> 10:42:23,862 matter, main is the first function\n 11018 10:42:23,862 --> 10:42:27,372 and so I'm just going to label\n 11019 10:42:27,372 --> 10:42:31,742 And what were the two variables I\n 11020 10:42:33,561 --> 10:42:35,761 And each of those was an\nint, so that's four bytes 11021 10:42:35,762 --> 10:42:38,482 so it's deliberate\nthat I reserved four-- 11022 10:42:38,482 --> 10:42:41,312 a chunk of wood here that's four bytes. 11023 10:42:41,311 --> 10:42:45,261 So let me just call this x, and I'm just\n 11024 10:42:45,771 --> 10:42:49,791 And then I had my other variable y, and\n 11025 10:42:49,792 --> 10:42:54,002 What happens when main calls swap\n 11026 10:42:54,002 --> 10:43:00,292 Well, it has two variables of its\n 11027 10:43:00,292 --> 10:43:04,702 and B is initially 2, but it\nhas a third variable, tmp 11028 10:43:04,701 --> 10:43:07,731 which is a local variable in\naddition to the arguments A and B 11029 10:43:07,732 --> 10:43:12,292 that are passed in, so I'm going\n 11030 10:43:12,292 --> 10:43:13,516 And what is the value of tmp? 11031 10:43:13,516 --> 10:43:15,141 Well, we have to look back at the code. 11032 10:43:15,141 --> 10:43:19,792 tmp initially gets the value of A.\n 11033 10:43:21,502 --> 10:43:23,961 That's step one in my\nthree line program. 
11034 10:43:23,961 --> 10:43:27,981 OK, A equals B. So that is assigned\n 11035 10:43:27,982 --> 10:43:31,612 into the A So B is 2, A is\nthis, so let me go ahead 11036 10:43:31,612 --> 10:43:33,722 and erase this and just overwrite that. 11037 10:43:33,722 --> 10:43:37,252 So at this moment in the story\nyou have two copies of two 11038 10:43:37,252 --> 10:43:40,072 so that's OK though, because\nthe third line of code 11039 10:43:40,072 --> 10:43:43,101 says tmp gets copied\ninto B. So what's tmp-- 11040 10:43:43,101 --> 10:43:48,531 1, gets copied into B, so let\nme overwrite this 2 with a 1 11041 10:43:50,182 --> 10:43:53,302 Now unfortunately, the code ends. 11042 10:43:53,302 --> 10:43:56,872 swap doesn't actually do anything\n 11043 10:43:56,872 --> 10:43:58,882 is that I could have had a return value. 11044 10:43:58,881 --> 10:44:01,101 I could go in there\nand change void to int 11045 10:44:01,101 --> 10:44:02,871 but which one am I going to return? 11046 10:44:04,582 --> 10:44:06,992 The whole goal is to\nswap two values, and it 11047 10:44:06,991 --> 10:44:08,991 seems kind of lame if you\ncan't write a function 11048 10:44:08,991 --> 10:44:12,021 to do something as common per\nlast week sorting algorithms 11049 10:44:14,902 --> 10:44:18,112 Well, even though when this\nprogram starts running 11050 10:44:18,112 --> 10:44:21,351 main is using this chunk of memory\n 11051 10:44:21,351 --> 10:44:24,021 and the stack is just like\na cafeteria stack of trays-- 11052 10:44:25,561 --> 10:44:27,651 Here's main's memory on the stack. 11053 10:44:27,652 --> 10:44:29,932 Here's the swap function's\nmemory on the stack. 11054 10:44:29,932 --> 10:44:32,601 It's using three ints instead of two-- 11055 10:44:34,311 --> 10:44:37,822 What happens when the function\n 11056 10:44:37,822 --> 10:44:41,061 The sort of recollection that\nthis is swap's memory goes away 11057 10:44:41,061 --> 10:44:42,651 and garbage values are left. 
11058 10:44:42,652 --> 10:44:46,891 So, adorably, we get rid\nof these values here 11059 10:44:46,891 --> 10:44:51,351 and there's still data there--\n 11060 10:44:51,351 --> 10:44:54,951 are still there in the computer's\n 11061 10:44:54,951 --> 10:44:56,701 because the function has now returned. 11062 10:44:56,701 --> 10:44:59,781 So they're still in there and this\n 11063 10:44:59,781 --> 10:45:03,141 of why there's other stuff in memory\n 11064 10:45:03,982 --> 10:45:06,432 Sometimes you did put\nit there, but now once 11065 10:45:06,432 --> 10:45:10,072 swap returns you only should be\ntouching memory inside of main. 11066 10:45:10,072 --> 10:45:14,362 But we've never actually\ncopied one value into main. 11067 10:45:14,362 --> 10:45:18,022 We haven't returned anything and we\n 11068 10:45:19,652 --> 10:45:23,662 Well, what if we instead passed\ninto swap not copies of x and y 11069 10:45:23,661 --> 10:45:28,041 calling them A and B. What if they\n 11070 10:45:28,042 --> 10:45:31,222 sort of a treasure map that\nwill lead swap to the actual x 11071 10:45:32,601 --> 10:45:36,411 Today we have that\ncapability using pointers. 11072 10:45:36,411 --> 10:45:40,281 So suppose that we\nuse this code instead. 11073 10:45:40,281 --> 10:45:43,192 There's a lot of stars going on\nhere, which is a bit annoying 11074 10:45:43,192 --> 10:45:45,862 but let's consider what it\nis we're trying to achieve. 11075 10:45:45,862 --> 10:45:50,752 What if we pass in not x and y, but\n 11076 10:45:50,752 --> 10:45:52,862 respectively--\nbreadcrumbs, if you will-- 11077 10:45:52,862 --> 10:45:55,881 that will lead swap to\nthe original values. 11078 10:45:55,881 --> 10:45:59,691 Then what we do is we still\ngive ourselves a tmp variable 11079 10:46:00,711 --> 10:46:03,051 It's still a glass, so\nwe still call it an int 11080 10:46:03,052 --> 10:46:05,432 but what do we want to put\ninto that temporary variable? 
11081 10:46:05,432 --> 10:46:08,014 We don't want to put A into it,\nbecause that's an address now. 11082 10:46:08,014 --> 10:46:10,731 We want to go to that\naddress per the star 11083 10:46:10,732 --> 10:46:12,502 and put whatever's at that address. 11084 10:46:13,741 --> 10:46:17,481 Well, we want to then copy\ninto whatever's at location A 11085 10:46:17,482 --> 10:46:20,272 we want to copy over to\nlocation A's contents 11086 10:46:20,271 --> 10:46:24,471 whatever is at location B's\ncontents and then lastly, we 11087 10:46:24,472 --> 10:46:27,622 want to copy tmp into\nwhatever's at location B. 11088 10:46:27,622 --> 10:46:31,510 So again, we're very deliberately\nintroducing all of these stars 11089 10:46:31,510 --> 10:46:33,802 because we don't want to\nchange any of these addresses 11090 10:46:33,802 --> 10:46:37,222 we want to go to these addresses\nper the dereference operator 11091 10:46:37,222 --> 10:46:41,582 and put values there,\nor get values from. 11092 10:46:41,582 --> 10:46:43,052 So what does this actually mean? 11093 10:46:43,052 --> 10:46:47,362 Well, if I kind of rewind in this story\n 11094 10:46:47,362 --> 10:46:53,031 although I'm going to delete its\n 11095 10:46:53,031 --> 10:46:56,481 and I still have A, but\nwhat's going to be different 11096 10:46:56,482 --> 10:47:00,412 this time is how I use A and B.\nSo let me finish erasing those. 11097 10:47:00,411 --> 10:47:02,541 That's A on the left,\nthis is B on the right. 11098 10:47:02,542 --> 10:47:05,061 At this point in the\nstory, we're rerunning swap 11099 10:47:05,061 --> 10:47:08,511 with this new and improved version,\nand let's see what happens. 11100 10:47:08,512 --> 10:47:12,232 Well, x is presumably at some address. 11101 10:47:12,232 --> 10:47:15,712 Maybe it's like 0x123, as always. 11102 10:47:15,711 --> 10:47:18,832 What then does A get\nwhen I'm using this code? 11103 10:47:28,012 --> 10:47:33,641 Well, I'm going to put 0x456,\nand then what am I going to do? 
11104 10:47:33,641 --> 10:47:35,832 Based on these three\nlines of code, I'm going 11105 10:47:35,832 --> 10:47:40,031 to store in tmp whatever is at the\n 11106 10:47:40,031 --> 10:47:43,061 That's this thing here, so\nI'm going to put 1 in tmp. 11107 10:47:43,061 --> 10:47:45,612 Line two-- I'm going to go to B-- 11108 10:47:45,612 --> 10:47:48,491 all right, B is 456, so\nI'm going to B and I'm 11109 10:47:48,491 --> 10:47:53,292 going to store 2 at whatever is\nat location A, and at location A 11110 10:47:53,292 --> 10:47:56,572 is 123, so that's this,\nso what am I going to do? 11111 10:47:56,572 --> 10:47:59,262 I'm going to change this 1 to a 2. 11112 10:47:59,262 --> 10:48:01,992 Last line of code-- get the\nvalue of tmp, which is 1 11113 10:48:01,991 --> 10:48:07,091 and then put it at whatever the\n 11114 10:48:07,091 --> 10:48:11,651 and change it to be the value\nof tmp, tmp, which puts 1 here. 11115 10:48:12,881 --> 10:48:14,441 There's still no return value. 11116 10:48:14,442 --> 10:48:17,742 swap returns, which means\nthese three temporary variables 11117 10:48:19,451 --> 10:48:21,831 They can be reused by\nsubsequent function calls 11118 10:48:21,832 --> 10:48:26,452 but now, I've actually\nswapped the values of x and y. 11119 10:48:26,451 --> 10:48:30,401 Which is to say what came as naturally\n 11120 10:48:30,402 --> 10:48:33,882 is not quite as simply\ndone in C because again 11121 10:48:33,881 --> 10:48:36,221 functions are isolated from each other. 11122 10:48:36,222 --> 10:48:39,502 You can pass in values but you\nget copies of those values. 11123 10:48:39,502 --> 10:48:44,052 If you want one function to affect the\n 11124 10:48:44,052 --> 10:48:47,382 you have to 1, understand\nwhat's going on but 2 11125 10:48:47,381 --> 10:48:50,331 pass things in as by a pointer here. 11126 10:48:50,332 --> 10:48:53,921 So if I go back to my code here,\n 11127 10:48:53,921 --> 10:48:56,021 Let me get rid of these extra printf's. 
11128 10:48:56,021 --> 10:48:58,752 Let me go in and add all these stars. 11129 10:48:58,752 --> 10:49:02,771 So I'm dereferencing these\nactual addresses here and here 11130 10:49:02,771 --> 10:49:05,182 and I've got to make one more change. 11131 10:49:05,182 --> 10:49:11,741 How do I now call swap if swap is\n 11132 10:49:11,741 --> 10:49:14,801 That is, the address of an int\nand the address of another int. 11133 10:49:14,802 --> 10:49:17,292 What do I change on line 11 here? 11134 10:49:25,591 --> 10:49:28,411 DAVID J. MALAN: Sorry,\nthe address of operator. 11135 10:49:28,411 --> 10:49:33,091 So up here on line 11, we do\nampersand x and ampersand y. 11136 10:49:33,091 --> 10:49:36,361 So that yes, we're technically\npassing in a copy of a value 11137 10:49:36,362 --> 10:49:39,241 but this time the copy we're passing\n 11138 10:49:39,241 --> 10:49:42,631 and as soon as we have an address, just\n 11139 10:49:42,631 --> 10:49:45,932 the foamy finger-- I can point at\n 11140 10:49:45,932 --> 10:49:49,921 and actually get a value from the\n 11141 10:49:52,182 --> 10:49:56,912 So let's cross our fingers\nnow and do make swap, Enter. 11142 10:49:56,911 --> 10:49:58,081 Oh my God, so many mistakes. 11143 10:49:58,082 --> 10:50:00,242 Oh, I didn't remember\nto change my prototype 11144 10:50:00,241 --> 10:50:03,781 so let me go way up here and\nadd two more stars because I 11145 10:50:05,161 --> 10:50:10,322 Make swap, ./swap, and voilà--\nnow I have actually swapped. 11146 10:50:17,021 --> 10:50:19,851 All right, so what more can we do here? 11147 10:50:19,851 --> 10:50:24,822 Well, let me consider\nthat all this time we've 11148 10:50:24,822 --> 10:50:29,052 been deliberately using\nGetString and GetInt and GetFloat 11149 10:50:29,052 --> 10:50:30,472 and so forth, but for a reason. 11150 10:50:30,472 --> 10:50:33,430 These aren't just training wheels\n 11151 10:50:33,430 --> 10:50:36,432 they're actually in place\nto make your code safer. 
11152 10:50:36,432 --> 10:50:40,872 And to illustrate this, let me go\n 11153 10:50:40,872 --> 10:50:45,222 How about a file called scanf.c. 11154 10:50:45,222 --> 10:50:48,252 It turns out that the old\nschool way-- the way in C 11155 10:50:48,252 --> 10:50:52,512 really, of getting user input,\nis via functions like scanf 11156 10:50:52,512 --> 10:50:56,112 and let me go ahead and include\nstdio.h, int main(void) 11157 10:50:56,112 --> 10:50:59,802 and without using the CS50 library at\n 11158 10:51:00,972 --> 10:51:03,522 Let me give myself an int called x. 11159 10:51:03,521 --> 10:51:07,436 Let me just print out what the value of\n 11160 10:51:07,436 --> 10:51:10,722 or rather, ask the user for\nthe value by asking them for x. 11161 10:51:10,722 --> 10:51:14,141 And I'm going to use a function\n 11162 10:51:14,141 --> 10:51:20,711 in an integer using %i, and I'm going\n 11163 10:51:22,667 --> 10:51:25,542 And then I'm going to go ahead and,\n 11164 10:51:25,542 --> 10:51:29,592 I'm going to print out with %i\n 11165 10:51:29,591 --> 10:51:32,681 All right, so line eight\nis week 1 style code. 11166 10:51:32,682 --> 10:51:36,351 Line five and six is week 1 style code. 11167 10:51:36,351 --> 10:51:41,771 So the curiosity today is this new line.\n 11168 10:51:43,332 --> 10:51:46,031 I'm using the same syntax\nthat I use for printf 11169 10:51:46,031 --> 10:51:49,451 which is kind of a little clue-- a\n 11170 10:51:49,451 --> 10:51:52,391 want to scan in, that is, read\nfrom the human's keyboard-- 11171 10:51:52,391 --> 10:51:55,932 and I'm telling it where to put\nwhatever the human typed in. 11172 10:51:55,932 --> 10:51:59,682 I can't just say x, because we run into\n 11173 10:51:59,682 --> 10:52:02,171 I have to give a little\nbreadcrumb to the variable 11174 10:52:02,171 --> 10:52:05,472 where I want scanf to\nput the human's integer. 11175 10:52:05,472 --> 10:52:08,902 And so this just tells the\ncomputer to get an int. 
11176 10:52:08,902 --> 10:52:11,141 This is what you would have\nhad to type, essentially 11177 10:52:11,141 --> 10:52:14,052 in week 1 just to get\nan int from the user 11178 10:52:14,052 --> 10:52:16,902 and there's a whole bunch of\nthings that can go wrong still 11179 10:52:16,902 --> 10:52:20,292 but that's the cryptic syntax we\n 11180 10:52:20,292 --> 10:52:22,241 Let me go ahead and make scanf here-- 11181 10:52:25,302 --> 10:52:27,252 Put the semicolon in the wrong place. 11182 10:52:30,641 --> 10:52:32,036 Non void doesn't return a value. 11183 10:52:42,332 --> 10:52:45,312 I'm going to type in a number like\n 11184 10:52:45,311 --> 10:52:49,542 So that is the traditional way of\n 11185 10:52:49,542 --> 10:52:53,012 The problem, though, is when you\n 11186 10:52:54,482 --> 10:52:56,650 Let me delete all of\nthis and give myself 11187 10:52:56,650 --> 10:52:59,192 a string s, although wait a\nminute-- we don't call it strings 11188 10:52:59,192 --> 10:53:02,252 anymore-- char star to store a string. 11189 10:53:02,252 --> 10:53:06,091 Then let me go ahead and just prompt the\n 11190 10:53:06,091 --> 10:53:10,891 Then let me go ahead and use scanf, ask\n 11191 10:53:10,891 --> 10:53:13,572 and store it at that address. 11192 10:53:13,572 --> 10:53:16,112 Then let me go ahead and print\nout whatever the human typed 11193 10:53:16,112 --> 10:53:19,002 in just by using the same notation. 11194 10:53:19,002 --> 10:53:24,152 So here, line five is the same thing\n 11195 10:53:24,152 --> 10:53:26,552 that layer today so it's char star s. 11196 10:53:26,552 --> 10:53:31,352 This is just week one this is\njust week one, line seven is new. 11197 10:53:31,351 --> 10:53:37,171 scanf will also read from the human's\n 11198 10:53:37,171 --> 10:53:39,002 But that's OK, because s is an address. 11199 10:53:39,002 --> 10:53:41,912 It's correct not to do the ampersand. 11200 10:53:42,811 --> 10:53:47,432 A string is and has always\nbeen a char star, a.k.a string. 
11201 10:53:47,432 --> 10:53:49,451 The problem, though, arises as follows-- 11202 10:53:51,771 --> 10:53:53,271 oh my God, what did I do wrong-- 11203 10:53:53,271 --> 10:53:55,791 I can't-- OK, we have certain\ndefenses in place with make. 11204 10:53:55,792 --> 10:54:02,241 Let me do clang of scanf.c, an\noutput of program called scanf. 11205 10:54:02,241 --> 10:54:05,199 All right, so I'm overriding\nsome of our pedagogical defenses 11206 10:54:05,199 --> 10:54:06,531 that we have in place with make. 11207 10:54:06,531 --> 10:54:11,121 Let me now run scanf of this version,\n 11208 10:54:15,701 --> 10:54:18,521 So it didn't even store something\n 11209 10:54:18,521 --> 10:54:22,182 This time it's in lowercase,\nbut that is somewhat related. 11210 10:54:22,182 --> 10:54:26,921 What did I fundamentally\ndo wrong though, here? 11211 10:54:26,921 --> 10:54:29,051 Why is this getting\nmore and more dangerous? 11212 10:54:29,052 --> 10:54:30,832 And let me illustrate\nthe point even more. 11213 10:54:30,832 --> 10:54:34,101 What if I type in not just something\n 11214 10:54:34,101 --> 10:54:39,941 What if I do like, hellooooo and\n 11215 10:54:45,451 --> 10:54:48,631 Right, a really long,\nunexpectedly long string. 11216 10:54:48,631 --> 10:54:50,491 This is the nondeterminism kicking in. 11217 10:54:51,781 --> 10:54:53,614 I was trying to trigger\na segmentation fault 11218 10:54:53,614 --> 10:54:56,851 but it wouldn't, but\nthe point still remains. 11219 10:54:56,851 --> 10:55:01,542 It's still not working, but what's\n 11220 10:55:01,542 --> 10:55:03,211 and it's not storing my actual input? 11221 10:55:04,091 --> 10:55:06,026 AUDIENCE: Do you have to make a space? 11222 10:55:06,027 --> 10:55:07,902 DAVID J. MALAN: We have\nto make space for it. 11223 10:55:07,902 --> 10:55:11,141 So what we're missing here is\nmalloc, or something like that. 11224 10:55:11,141 --> 10:55:14,101 So I could do that, I could\ndo something like this. 
11225 10:55:14,101 --> 10:55:16,801 Well, let the human type in\nat least a three letter word 11226 10:55:16,802 --> 10:55:20,942 so I could do malloc of 3\nplus 1 for the null character. 11227 10:55:20,942 --> 10:55:25,322 So let me give them four characters,\n 11228 10:55:26,281 --> 10:55:28,442 Nope, sorry. clang, I have to-- 11229 10:55:29,582 --> 10:55:36,171 Oh, include stdlib.h-- there we go. 11230 10:55:36,171 --> 10:55:39,197 That gives me malloc, now I'm\n 11231 10:55:39,197 --> 10:55:42,322 now I'm going to rerun it, and now I'm\n 11232 10:55:43,701 --> 10:55:47,421 And let me get a little aggressive now\n 11233 10:55:47,421 --> 10:55:49,461 Still works, but I'm getting lucky. 11234 10:55:53,031 --> 10:55:55,355 Damn it, that still works, too. 11235 10:55:56,451 --> 10:55:58,650 But it actually-- not quite. 11236 10:55:58,650 --> 10:56:00,771 There's some weirdness\ngoing on there already. 11237 10:56:00,771 --> 10:56:02,371 It turns out I can also do this. 11238 10:56:02,372 --> 10:56:05,751 I could actually just say\nchar s[4] and give myself 11239 10:56:05,750 --> 10:56:07,041 an array of four characters. 11240 10:56:07,042 --> 10:56:08,461 Let me try this one more time. 11241 10:56:08,461 --> 10:56:12,021 So let me rerun clang ./scanf. 11242 10:56:12,021 --> 10:56:16,820 Hellooooooo, clearly exceeding\nthe four characters-- 11243 10:56:22,182 --> 10:56:24,703 So the point here, though, is\nif we hadn't given you GetInt 11244 10:56:24,703 --> 10:56:27,161 you would have had to use the\nscanf thing-- not a huge deal 11245 10:56:28,432 --> 10:56:31,682 But if we hadn't given you GetString you\n 11246 10:56:31,682 --> 10:56:34,842 knowing about malloc already or\n 11247 10:56:34,841 --> 10:56:36,911 and even now there's a danger. 
11248 10:56:36,911 --> 10:56:41,112 If the human types in five letters,\n 11249 10:56:41,112 --> 10:56:44,862 like with the Hello input, will\n 11250 10:56:44,862 --> 10:56:46,842 So GetString also has\nthis functionality built 11251 10:56:46,841 --> 10:56:49,150 in where we have a\nfancy loop inside such 11252 10:56:49,150 --> 10:56:53,682 that we allocate using malloc as\n 11253 10:56:53,682 --> 10:56:55,631 and we use malloc\nessentially every keystroke. 11254 10:56:55,631 --> 10:57:00,461 The moment you type in h-e-l-l-o, we're\n 11255 10:57:00,461 --> 10:57:04,932 allocating more and more memory so that\n 11256 10:57:04,932 --> 10:57:07,661 GetString even though\nit's this easy to crack-- 11257 10:57:07,661 --> 10:57:10,811 this easy to crash your code\nusing scanf if you again 11258 10:57:10,811 --> 10:57:13,481 did it without the help of a library. 11259 10:57:13,482 --> 10:57:15,539 So where are we all going with this? 11260 10:57:15,538 --> 10:57:17,621 Well, let me show you a\nfew final examples that'll 11261 10:57:17,622 --> 10:57:19,961 pave the way for what\nwill be problem set four. 
11262 10:57:19,961 --> 10:57:23,122 Let me go ahead and open\nup from today's code-- 11263 10:57:23,122 --> 10:57:25,241 which is available on\nthe course's website-- 11264 10:57:25,241 --> 10:57:32,202 for instance, a program like\nthis, called phonebook.c 11265 10:57:32,201 --> 10:57:34,900 and I'm just going to give\nyou a quick tour of it 11266 10:57:34,900 --> 10:57:37,862 that you'll see more details on in\n 11267 10:57:37,862 --> 10:57:40,570 We're going to introduce a few\n 11268 10:57:40,571 --> 10:57:43,812 You're going to see a function called\n 11269 10:57:43,811 --> 10:57:47,203 and it takes two arguments-- the\n 11270 10:57:47,203 --> 10:57:50,411 that you might manipulate in Excel or\n 11271 10:57:50,411 --> 10:57:55,211 separated values, and then something\n 11272 10:57:55,211 --> 10:57:58,150 W for write, depending on whether\nyou want to add to the file 11273 10:57:58,150 --> 10:58:00,682 just open it up, or change it. 11274 10:58:00,682 --> 10:58:03,192 We're going to introduce\nyou to a file pointer. 11275 10:58:03,192 --> 10:58:05,031 You'll see that capital file-- 11276 10:58:05,031 --> 10:58:07,631 which is a little bit\nunconventional-- capital file is 11277 10:58:07,631 --> 10:58:10,481 a pointer to an actual file\non the computer's hard drive 11278 10:58:10,482 --> 10:58:13,001 so that you can actually access\nsomething like a CSV file 11279 10:58:14,351 --> 10:58:16,661 And we're going to see\ndown below that you're also 11280 10:58:16,661 --> 10:58:20,411 going to have the ability to write\n 11281 10:58:20,411 --> 10:58:24,341 You'll see functions like\nfprintf, for file printf. 11282 10:58:24,341 --> 10:58:29,471 Or fwrite-- file write-- which now that\n 11283 10:58:29,472 --> 10:58:33,311 you'll have the ability to\nactually not only read files-- 11284 10:58:33,311 --> 10:58:36,830 text files, images, other\nthings-- but also write them out. 
11285 10:58:36,830 --> 10:58:42,281 In fact for instance, just as a teaser\n 11286 10:58:42,281 --> 10:58:44,682 we focus on this week where\nwe give you a forensic image 11287 10:58:44,682 --> 10:58:47,351 and your goal is to\nrecover as many photographs 11288 10:58:47,351 --> 10:58:51,012 from this forensic image of a\n 11289 10:58:51,012 --> 10:58:54,432 And the way you're going to do\nthat is by knowing in advance 11290 10:58:54,432 --> 10:58:58,932 that every JPEG in the world starts\n 11291 10:58:58,932 --> 10:59:01,161 in hexadecimal, but these three numbers. 11292 10:59:01,161 --> 10:59:03,881 And so in fact, just as\na teaser, let me open up 11293 10:59:03,881 --> 10:59:07,061 an example you'll see on the\ncourse's website for today. 11294 10:59:07,061 --> 10:59:09,796 If I scroll through here,\nyou'll see a program 11295 10:59:09,796 --> 10:59:11,421 that does a little something like this. 11296 10:59:13,572 --> 10:59:15,762 if we could hit the button-- 11297 10:59:16,402 --> 10:59:21,582 So here we have the notion of a byte\n 11298 10:59:21,582 --> 10:59:24,461 We'll see a data type called byte,\nwhich is a common convention. 11299 10:59:25,701 --> 10:59:28,034 And you're going to learn\nabout a function called fread 11300 10:59:28,035 --> 10:59:31,932 which reads from a file some number\n 11301 10:59:31,932 --> 10:59:33,701 We might then use code like this. 11302 10:59:33,701 --> 10:59:37,362 If bytes bracket zero\nequals equals 0xFF and bytes 11303 10:59:37,362 --> 10:59:43,122 bracket 1 equals 0xD8 and bytes bracket\n 11304 10:59:43,122 --> 10:59:47,842 bytes I just claimed represent a\n 11305 10:59:47,841 --> 10:59:51,171 Let me go ahead and run\nthis program as follows. 11306 10:59:51,171 --> 10:59:55,281 Let me copy jpeg.c into my\ndirectory from today's distribution. 11307 10:59:55,281 --> 11:00:03,432 Let me do make jpeg, and let me run\n 11308 11:00:03,432 --> 11:00:07,201 called lecture.jpeg, and I\nclaim yes, it's possibly a JPEG. 
11309 11:00:08,201 --> 11:00:11,841 Let me open it up for us, called\n 11310 11:00:11,841 --> 11:00:15,941 is that same photo with which we began\n 11311 11:00:15,942 --> 11:00:18,072 But what we're also\ngoing to do this week 11312 11:00:18,072 --> 11:00:22,991 is start to implement our own sort\n 11313 11:00:22,991 --> 11:00:26,261 we might take images and actually\n 11314 11:00:26,262 --> 11:00:28,279 creates different versions thereof. 11315 11:00:28,279 --> 11:00:30,072 For instance, using a\ndifferent file format 11316 11:00:30,072 --> 11:00:33,862 called BMP, which essentially lays out\n 11317 11:00:35,262 --> 11:00:36,822 You're going to see a struct-- 11318 11:00:36,822 --> 11:00:38,862 a data struct in C that's\nway more complicated 11319 11:00:38,862 --> 11:00:40,991 than the candidate\nstructure from the past 11320 11:00:40,991 --> 11:00:43,226 or the person structure\nfrom the past, that 11321 11:00:43,226 --> 11:00:45,851 looks like this, which is just\na whole bunch more values in it 11322 11:00:45,851 --> 11:00:47,769 but we'll walk you through\nthese in the p-set. 11323 11:00:47,769 --> 11:00:49,781 And we might take a\nphotograph like this and ask 11324 11:00:49,781 --> 11:00:52,241 you to run a few different\nfilters on it a la Instagram 11325 11:00:52,241 --> 11:00:55,871 like a black and white filter,\nor grayscale, a sepia filter 11326 11:00:55,872 --> 11:00:59,891 to give it some old school feel, or\n 11327 11:00:59,891 --> 11:01:02,481 or blur it, even in this way. 11328 11:01:02,482 --> 11:01:05,472 And just to end on a note\nhere, I have a version 11329 11:01:05,472 --> 11:01:08,982 of this code ready to go that doesn't\n 11330 11:01:08,982 --> 11:01:11,712 it just implements one filter initially. 11331 11:01:11,711 --> 11:01:14,411 Let me go ahead and just ready\nthis on my computer here. 
11332 11:01:14,411 --> 11:01:16,466 I'm going to go into my\nown version of filter 11333 11:01:16,466 --> 11:01:18,341 and you'll see a few\nfiles that will give you 11334 11:01:18,341 --> 11:01:21,981 a tour of this coming week\nin bitmap.h, for instance 11335 11:01:21,982 --> 11:01:26,872 is a version of this structure that\n 11336 11:01:26,872 --> 11:01:34,722 And let me show you this file here,\n 11337 11:01:34,722 --> 11:01:38,412 called filter that I've already\nimplemented in advance today. 11338 11:01:38,411 --> 11:01:41,471 But the ones we give you for the piece\n 11339 11:01:41,472 --> 11:01:43,847 this function called filter\ntakes the height of an image 11340 11:01:43,847 --> 11:01:46,942 the width of an image, and\na two dimensional array. 11341 11:01:46,942 --> 11:01:49,932 So rows and columns\nof pixels, and then I 11342 11:01:49,932 --> 11:01:53,771 have a loop like this that iterates over\n 11343 11:01:55,402 --> 11:01:57,372 And then notice what\nI'm going to do here. 11344 11:01:57,372 --> 11:02:00,552 I'm going to change the blue\nvalue to be zero in this case 11345 11:02:00,552 --> 11:02:02,962 and the green value to\nbe zero in this case. 11346 11:02:03,701 --> 11:02:07,451 Well, the image I have\nhere in mind is this one 11347 11:02:07,451 --> 11:02:10,241 whereby we have this\nhidden image that simply 11348 11:02:10,241 --> 11:02:13,511 has old school style-- a\nsecret message embedded in it. 11349 11:02:13,512 --> 11:02:16,722 And if you don't happen to have in\n 11350 11:02:16,722 --> 11:02:18,942 glasses that essentially\nmake everything red-- 11351 11:02:18,942 --> 11:02:21,817 getting rid of the green in the\n 11352 11:02:21,817 --> 11:02:24,192 you can actually-- I'm actually\nprobably the only one who 11353 11:02:24,192 --> 11:02:26,472 can read this right\nnow-- see what message 11354 11:02:26,472 --> 11:02:28,752 is hidden behind all of this red noise. 
11355 11:02:28,752 --> 11:02:34,482 But if using my code written here in\n 11356 11:02:34,482 --> 11:02:37,182 in the picture and I get rid of\nall the green in the picture 11357 11:02:37,182 --> 11:02:39,792 essentially implementing\nthe idea of this filter-- 11358 11:02:39,792 --> 11:02:42,612 this red filter where you only see red-- 11359 11:02:42,612 --> 11:02:45,862 well, let's go ahead and\ncompile this program. 11360 11:02:45,862 --> 11:02:50,832 Make filter, run ./filter\non this hidden message.bmp. 11361 11:02:50,832 --> 11:02:53,891 I'm going to save it in a\nnew file called message.bmp 11362 11:02:53,891 --> 11:02:56,832 and with one final flourish\nwe're going to open up 11363 11:02:56,832 --> 11:03:00,732 message.bmp, which is the result\nof having put on these glasses 11364 11:03:00,732 --> 11:03:03,882 and hopefully now you\ntoo will see what I see. 11365 11:03:12,891 --> 11:03:14,292 All right, that's it for CS50! 11366 11:04:39,311 --> 11:04:42,222 And this is already week 5,\nwhich means this is actually 11367 11:04:44,722 --> 11:04:48,552 In fact, in just a few days'\ntime, what has looked like this 11368 11:04:48,552 --> 11:04:50,972 and much more cryptic\nthan this perhaps, is 11369 11:04:50,972 --> 11:04:53,472 going to be distilled into\nsomething much simpler next week. 11370 11:04:53,472 --> 11:04:55,632 When we transition to a\nlanguage called Python. 11371 11:04:55,631 --> 11:04:59,951 And with Python, we'll still have our\n 11372 11:05:00,654 --> 11:05:03,822 But a lot of the low-level plumbing\n 11373 11:05:03,822 --> 11:05:06,502 struggling with, frustrated by,\nover the past couple of weeks 11374 11:05:06,502 --> 11:05:08,802 especially, now that\nwe've introduced pointers. 11375 11:05:08,802 --> 11:05:11,682 And it feels like you probably\nhave to do everything yourself. 
11376 11:05:11,682 --> 11:05:14,542 In Python, and in a lot\nof higher level languages 11377 11:05:14,542 --> 11:05:16,932 so to speak-- more modern,\nmore recent languages 11378 11:05:16,932 --> 11:05:20,021 you'll be able to do so much more\n 11379 11:05:20,021 --> 11:05:23,021 And indeed, we're going to start\n 11380 11:05:24,461 --> 11:05:27,641 Frameworks, which is collections of\n 11381 11:05:27,641 --> 11:05:31,091 And on top of all that, will you be\n 11382 11:05:31,091 --> 11:05:34,691 impressive projects, that actually solve\n 11383 11:05:34,692 --> 11:05:37,582 Particularly, by way of\nyour own final project. 11384 11:05:37,582 --> 11:05:41,082 So last week though, in week 4,\n 11385 11:05:41,082 --> 11:05:43,692 And we've been treating this\nmemory inside of your computer 11386 11:05:45,042 --> 11:05:48,252 At the end of the day, it's just\n 11387 11:05:48,252 --> 11:05:51,381 And it's really up to you\nwhat you do with those bytes. 11388 11:05:51,381 --> 11:05:54,881 And how you interconnect them, how\n 11389 11:05:54,881 --> 11:05:56,959 And arrays, were like\none of the simplest ways. 11390 11:05:56,959 --> 11:05:58,752 We started playing\naround with that memory. 11391 11:05:58,752 --> 11:06:00,641 Just contiguous chunks of memory. 11392 11:06:01,781 --> 11:06:04,512 But let's consider, for a\nmoment, some of the problems that 11393 11:06:04,512 --> 11:06:06,101 pretty quickly arise with arrays. 11394 11:06:06,101 --> 11:06:09,671 And then, today focus on what more\n 11395 11:06:09,671 --> 11:06:14,591 Using your computer's memory as\na much more versatile canvas 11396 11:06:14,591 --> 11:06:16,861 to create even\ntwo-dimensional structures. 11397 11:06:16,862 --> 11:06:18,612 To represent information,\nand, ultimately 11398 11:06:18,612 --> 11:06:20,692 to solve more interesting problems. 11399 11:06:20,692 --> 11:06:22,272 So here's an array of size 3. 11400 11:06:22,271 --> 11:06:24,072 Maybe, the size of 3 integers. 
11401 11:06:24,072 --> 11:06:26,319 And suppose that this\nis inside of a program. 11402 11:06:26,319 --> 11:06:29,112 And at this point in the story,\n 11403 11:06:30,521 --> 11:06:34,558 And suppose, whatever the context,\n 11404 11:06:36,432 --> 11:06:39,449 Well, instinctively, where\nshould the number 4 go? 11405 11:06:39,449 --> 11:06:41,531 If this is your computer's\nmemory and we currently 11406 11:06:41,531 --> 11:06:43,241 have this array 1, 2, 3, from what. 11407 11:06:44,591 --> 11:06:47,822 Where should the number 4\njust, perhaps, naively go? 11408 11:06:52,061 --> 11:06:53,502 So you could replace number 1. 11409 11:06:53,502 --> 11:06:55,377 I don't really like\nthat, though, because I'd 11410 11:06:55,377 --> 11:06:56,771 like to keep number 1 around. 11411 11:06:58,061 --> 11:06:59,811 But I'm losing, of course, information. 11412 11:06:59,811 --> 11:07:02,271 So what else could I do if\nI want to add the number 4? 11413 11:07:02,771 --> 11:07:04,146 AUDIENCE: On the right side of 3. 11414 11:07:04,813 --> 11:07:06,953 So, I mean, it feels like\nif there's some ordering 11415 11:07:06,953 --> 11:07:09,161 to these, which seems kind\nof a reasonable inference 11416 11:07:09,161 --> 11:07:11,261 that it probably belongs\nsomewhere over here. 11417 11:07:11,262 --> 11:07:14,742 But recall last week, as we started\n 11418 11:07:14,741 --> 11:07:16,612 there's other stuff\npotentially going on. 11419 11:07:16,612 --> 11:07:20,232 And if we fill that in, ideally, we'd\n 11420 11:07:20,232 --> 11:07:22,062 If we're maintaining this kind of order. 11421 11:07:22,061 --> 11:07:24,461 But recall in the context\nof your computer's memory 11422 11:07:24,461 --> 11:07:25,902 there might be other stuff there. 11423 11:07:25,902 --> 11:07:28,414 Some of these garbage\nvalues that might be usable 11424 11:07:28,413 --> 11:07:30,371 but we don't really know\nor care what they are. 11425 11:07:30,372 --> 11:07:31,961 As represented by Oscar here.
11426 11:07:31,961 --> 11:07:34,991 But there might actually\nbe useful data in use. 11427 11:07:34,991 --> 11:07:38,381 Like, if your program has not\njust a few integers in this array 11428 11:07:38,381 --> 11:07:40,511 but also a string that\nsays like, "Hello, world." 11429 11:07:40,512 --> 11:07:46,572 It could be that your computer has\n 11430 11:07:48,192 --> 11:07:50,442 Well, maybe, you created the\narray in one line of code 11431 11:07:52,091 --> 11:07:54,491 Maybe the next line of\ncode used get_string. 11432 11:07:54,491 --> 11:07:57,711 Or maybe just hard coded a string\n 11433 11:07:57,711 --> 11:08:00,459 And so you painted yourself\ninto a corner, so to speak. 11434 11:08:00,459 --> 11:08:03,042 Now I think you might claim,\nwell, let's just overwrite the H. 11435 11:08:03,042 --> 11:08:04,991 But that's problematic\nfor the same reasons. 11436 11:08:06,711 --> 11:08:09,612 So where else could the 4 go? 11437 11:08:09,612 --> 11:08:12,851 Or how do we solve this problem\nif we want to add a number 11438 11:08:12,851 --> 11:08:14,561 and there's clearly memory available? 11439 11:08:14,561 --> 11:08:17,951 Because those garbage values are junk\n 11440 11:08:17,951 --> 11:08:20,081 So we could certainly reuse those. 11441 11:08:20,082 --> 11:08:23,722 Where could the 4, and\nperhaps this whole array, go? 11442 11:08:24,222 --> 11:08:26,052 So I'm hearing we could\nmove it somewhere. 11443 11:08:26,052 --> 11:08:27,885 Maybe, replace some of\nthose garbage values. 11444 11:08:27,885 --> 11:08:29,902 And honestly, we have a lot of options. 11445 11:08:29,902 --> 11:08:32,141 We could use any of these\ngarbage values up here. 11446 11:08:32,141 --> 11:08:34,881 We could use any of these down\nhere, or even further down.
11447 11:08:34,881 --> 11:08:38,441 The point is there is plenty\nof memory available as 11448 11:08:38,442 --> 11:08:41,891 indicated by these Oscars, where\nwe could put 4, maybe even, 5 11449 11:08:43,271 --> 11:08:46,451 The catch is that we\nchose poorly early on. 11450 11:08:47,531 --> 11:08:51,167 And 1, 2, 3 ended up back-to-back with\n 11451 11:08:52,250 --> 11:08:55,060 Let's go ahead and assume that\n 11452 11:08:55,061 --> 11:08:58,226 And we'll plop the new\narray in this location here. 11453 11:08:58,226 --> 11:09:00,101 So I'm going to go ahead\nand copy the 1 over. 11454 11:09:01,902 --> 11:09:04,634 And then, ultimately, once\nI'm ready to fill the 4 11455 11:09:04,633 --> 11:09:07,091 I can throw away, essentially,\nthe old array at this point. 11456 11:09:07,091 --> 11:09:09,101 Because I have it now\nentirely in duplicate. 11457 11:09:09,101 --> 11:09:11,241 And I can populate it with the number 4. 11458 11:09:12,612 --> 11:09:15,582 That is a correct potential\nsolution to this problem. 11459 11:09:16,665 --> 11:09:19,624 And this is something we're going to\n 11460 11:09:19,624 --> 11:09:22,302 What's the downside of having\nsolved this problem in this way? 11461 11:09:23,896 --> 11:09:25,271 I'm adding a lot of running time. 11462 11:09:25,271 --> 11:09:28,061 It took me a lot of effort to\ncopy those additional numbers. 11463 11:09:28,061 --> 11:09:29,502 Now, granted, it's a small array. 11464 11:09:30,502 --> 11:09:32,377 It's going to be over\nin the blink of an eye. 11465 11:09:32,377 --> 11:09:35,061 But if we start talking\nabout interesting data sets 11466 11:09:35,061 --> 11:09:37,671 web application data sets,\nmobile app data sets. 11467 11:09:37,671 --> 11:09:41,152 Where you have not just a few, but\n 11468 11:09:41,152 --> 11:09:43,112 a few million pieces of data. 
11469 11:09:43,112 --> 11:09:46,252 This is probably a suboptimal\nsolution to just, oh 11470 11:09:46,252 --> 11:09:48,234 move all your data from\none place to another. 11471 11:09:48,233 --> 11:09:49,941 Because who's to say\nthat we're not going 11472 11:09:49,942 --> 11:09:51,531 to paint ourselves into a new corner. 11473 11:09:51,531 --> 11:09:54,741 And it would feel like you're wasting\n 11474 11:09:54,741 --> 11:09:58,591 And, ultimately, just costing\nyourself a huge amount of time. 11475 11:09:58,591 --> 11:10:01,611 In fact, if we put this now into\n 11476 11:10:01,612 --> 11:10:06,531 from a few weeks back, what might\nthe running time now of Search 11477 11:10:08,752 --> 11:10:10,912 A throwback a couple of weeks ago. 11478 11:10:10,911 --> 11:10:14,061 If you're using an array, to\nrecap, what was the running time 11479 11:10:14,061 --> 11:10:17,072 of a Search algorithm in Big O notation? 11480 11:10:17,072 --> 11:10:19,252 So, maybe, in the worst case. 11481 11:10:19,252 --> 11:10:23,031 If you've got n numbers, 3 in this\n 11482 11:10:28,582 --> 11:10:30,202 And what's your intuition for that? 11483 11:10:36,792 --> 11:10:39,584 So if we go through each element,\n 11484 11:10:39,584 --> 11:10:42,972 then Search is going to take\nthis a Big O running time. 11485 11:10:42,972 --> 11:10:46,002 If, though, we're talking about\nthese numbers, specifically. 11486 11:10:46,002 --> 11:10:48,972 And now I'll explicitly stipulate\nthat, yeah, they're sorted. 11487 11:10:50,141 --> 11:10:54,432 What would the Big O notation be\n 11488 11:10:54,432 --> 11:10:56,921 be it of size 3, or 4,\nor n, more generally. 11489 11:10:57,972 --> 11:10:59,772 SPEAKER 1: Big O of, not n, but rather? 11490 11:11:01,182 --> 11:11:05,190 Because we could use per week zero\n 11491 11:11:05,190 --> 11:11:06,732 we'd have to deal with some rounding. 11492 11:11:06,732 --> 11:11:08,922 Because there's not a perfect\nnumber of elements at the moment. 
11493 11:11:08,921 --> 11:11:10,332 But you could use binary search. 11494 11:11:11,652 --> 11:11:13,391 And then go left or\nright, left or right 11495 11:11:13,391 --> 11:11:15,141 until you find the\nelement you care about. 11496 11:11:15,141 --> 11:11:19,302 So Search remains in Big O\nof log n when using arrays. 11497 11:11:19,302 --> 11:11:21,132 But what about insertion, now? 11498 11:11:21,131 --> 11:11:23,171 If we start to think\nabout other operations. 11499 11:11:23,171 --> 11:11:26,862 Like, adding a number to this array,\n 11500 11:11:26,862 --> 11:11:29,531 app, or Google finding\nanother page on the internet. 11501 11:11:29,531 --> 11:11:31,991 So insertion happens all the time. 11502 11:11:31,991 --> 11:11:34,811 What's the running time of Insert? 11503 11:11:34,811 --> 11:11:38,112 When it comes to inserting into\nan existing array of size n. 11504 11:11:38,112 --> 11:11:40,781 How many steps might that take? 11505 11:11:43,201 --> 11:11:46,061 Because in the worst case,\nwhere you're out of space 11506 11:11:46,061 --> 11:11:48,629 you have to allocate, it\nwould seem, a new array. 11507 11:11:48,629 --> 11:11:50,921 Maybe, taking over some of\nthe previous garbage values. 11508 11:11:50,921 --> 11:11:52,661 But the catch is, even\nthough you're only 11509 11:11:52,661 --> 11:11:55,031 inserting one new number,\nlike the number 4 11510 11:11:55,031 --> 11:11:58,551 you have to copy over all the darn\n 11511 11:11:58,552 --> 11:12:01,542 So if your original array of\nsize n, the copying of that 11512 11:12:01,542 --> 11:12:03,412 is going to take Big O of n plus 1. 11513 11:12:03,411 --> 11:12:06,411 But we can throw away the plus 1\n 11514 11:12:06,411 --> 11:12:09,341 So Insert now becomes Big O of n. 11515 11:12:09,341 --> 11:12:11,201 And that might not be ideal. 
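The O(log n) search recapped above can be sketched in C as follows. This is a minimal sketch, assuming the array is sorted in ascending order; the function name `binary_search` and its parameters are hypothetical and are not taken from the lecture's code.

```c
#include <stdbool.h>

// Hypothetical helper: looks for target in a sorted array of n ints.
// Each pass halves the remaining range, which gives the O(log n) bound.
bool binary_search(const int *values, int n, int target)
{
    int low = 0;
    int high = n - 1;

    while (low <= high)
    {
        // Middle of the current range (written this way to avoid overflow)
        int mid = low + (high - low) / 2;

        if (values[mid] == target)
        {
            return true;
        }
        else if (values[mid] < target)
        {
            low = mid + 1;  // go right
        }
        else
        {
            high = mid - 1; // go left
        }
    }
    return false;
}
```

Searching stays cheap this way, but it does nothing for insertion: growing a full array still means copying every existing element, which is the O(n) cost described above.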
11516 11:12:11,201 --> 11:12:13,991 Because if you're in the habit\nof inserting things frequently 11517 11:12:13,991 --> 11:12:16,362 that could start to add\nup, and add up, and add up. 11518 11:12:16,362 --> 11:12:19,302 And this is why computer programs,\nand websites, and mobile apps 11519 11:12:20,472 --> 11:12:23,482 If you're not being mindful\nof these trade-offs. 11520 11:12:23,482 --> 11:12:27,492 So what about, just for good\nmeasure, Omega notation? 11521 11:12:28,752 --> 11:12:31,241 Well, just to recap\nhere, we could get lucky 11522 11:12:31,241 --> 11:12:33,533 and Search could just take one step. 11523 11:12:33,533 --> 11:12:35,741 Because you might just get\nlucky, and boom, the number 11524 11:12:35,741 --> 11:12:38,292 you're looking for is right there in\n 11525 11:12:38,292 --> 11:12:40,152 Or even linear search, for that matter. 11526 11:12:41,201 --> 11:12:45,191 If there's enough room, and we didn't\n 11527 11:12:45,192 --> 11:12:46,728 1, 2, and 3, to a new location. 11528 11:12:47,561 --> 11:12:49,722 And we could have, as\nsomeone suggested, just 11529 11:12:49,722 --> 11:12:51,520 put the number 4 right there at the end. 11530 11:12:51,519 --> 11:12:53,561 And if we don't get lucky,\nit might take n steps. 11531 11:12:53,561 --> 11:12:57,441 If we do get lucky, it might just take\n 11532 11:12:57,442 --> 11:12:59,152 In fact, let me go ahead and do this. 11533 11:12:59,152 --> 11:13:00,802 How about we do something like this? 11534 11:13:00,802 --> 11:13:02,502 Let me switch over to some code here. 11535 11:13:02,502 --> 11:13:05,591 Let me start to make a\nprogram called list.c. 11536 11:13:05,591 --> 11:13:08,270 And in list.c, let's\nstart with the old way. 11537 11:13:08,271 --> 11:13:11,512 So we follow the breadcrumbs we've\n 11538 11:13:11,512 --> 11:13:14,952 So in this list.c, I'm going\nto include stdio.h.
11539 11:13:16,932 --> 11:13:20,262 Then inside of my code here, I'm\n 11540 11:13:20,262 --> 11:13:22,072 the first version of memory. 11541 11:13:22,072 --> 11:13:26,811 So int list[3] is now implemented\nat the moment, in an array. 11542 11:13:26,811 --> 11:13:29,169 So we're rewinding for\nnow to week 2 style code. 11543 11:13:29,169 --> 11:13:31,002 And then, let me just\ninitialize this thing. 11544 11:13:31,002 --> 11:13:32,682 At the first location will be 1. 11545 11:13:32,682 --> 11:13:34,722 At the next location will be 2. 11546 11:13:34,722 --> 11:13:37,391 And at the last location will be 3. 11547 11:13:37,391 --> 11:13:39,722 So the array is zero-indexed always. 11548 11:13:39,722 --> 11:13:41,472 I, for just the sake\nof discussion though 11549 11:13:41,472 --> 11:13:44,902 am putting in the numbers 1, 2,\n3, like a normal person might. 11550 11:13:45,402 --> 11:13:46,819 So now let's just print these out. 11551 11:13:50,322 --> 11:13:53,232 Let's go ahead now and\nprint out using printf. 11552 11:13:56,141 --> 11:13:59,771 So very simple program, inspired\nby what we did in week 2. 11553 11:13:59,771 --> 11:14:03,682 Just to create and then print\nout the contents of an array. 11554 11:14:05,862 --> 11:14:09,942 So far, so good. ./list.\nAnd voila, we see 1, 2, 3. 11555 11:14:09,942 --> 11:14:14,952 Now let's start to practice some of what\n 11556 11:14:14,951 --> 11:14:19,541 So let me go in now and get\nrid of the array version. 11557 11:14:19,542 --> 11:14:22,391 And let me zoom out a little bit\n 11558 11:14:22,391 --> 11:14:25,932 And now let's begin to\ncreate a list of size 3. 11559 11:14:25,932 --> 11:14:29,112 So if I'm going to do\nthis now, dynamically 11560 11:14:29,112 --> 11:14:33,262 so that I'm allocating these\nthings again and again 11561 11:14:33,262 --> 11:14:34,912 let me go ahead and do this.
11562 11:14:34,911 --> 11:14:41,951 Let me give myself a list that's of type\n 11563 11:14:41,951 --> 11:14:48,971 of 3 times the size of an int, so what\n 11564 11:14:48,972 --> 11:14:51,972 enough memory for that very first\npicture we drew on the board. 11565 11:14:51,972 --> 11:14:54,641 Which was the array\ncontaining 1, 2, and 3. 11566 11:14:54,641 --> 11:14:57,472 But laying the foundation\nto be able to resize it 11567 11:14:57,472 --> 11:14:59,061 which was ultimately the goal. 11568 11:14:59,061 --> 11:15:01,131 So my syntax is a little different here. 11569 11:15:01,131 --> 11:15:04,572 I'm going to use malloc and get memory\n 11570 11:15:05,482 --> 11:15:09,372 Instead of using the stack by just\n 11571 11:15:12,161 --> 11:15:16,572 That is to say this line of code from\n 11572 11:15:16,572 --> 11:15:20,112 identical to this line of\ncode in the second version. 11573 11:15:20,112 --> 11:15:22,211 But the first line of\ncode puts the memory 11574 11:15:22,211 --> 11:15:24,372 on the stack, automatically, for me. 11575 11:15:24,372 --> 11:15:27,282 The second line of code,\nthat I've left here now 11576 11:15:27,281 --> 11:15:30,762 is creating an array of size 3,\nbut it's putting it on the heap. 11577 11:15:30,762 --> 11:15:34,382 And that's important because it was only\n 11578 11:15:35,311 --> 11:15:38,341 That you can actually ask for more\n 11579 11:15:38,341 --> 11:15:42,241 When you just use the\nfirst notation int list[3] 11580 11:15:42,241 --> 11:15:45,631 you have permanently given\nyourself an array of size 3. 11581 11:15:45,631 --> 11:15:48,612 You cannot add to that in code. 11582 11:15:48,612 --> 11:15:50,491 So let me go ahead and do this. 11583 11:15:50,491 --> 11:15:53,625 If list == NULL, something went wrong. 11584 11:15:53,625 --> 11:15:54,792 The computer's out of memory. 11585 11:15:54,792 --> 11:15:56,985 So let's just return 1 and\nquit out of this program. 11586 11:15:56,985 --> 11:15:58,152 There's nothing to see here.
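The heap allocation and NULL check described above can be sketched as a small function. This is an illustration under stated assumptions, not the lecture's list.c: the helper name `make_list` is hypothetical, and it returns NULL to its caller instead of returning 1 from main.

```c
#include <stdlib.h>

// Hypothetical helper: asks malloc for room for 3 ints on the heap and
// stores 1, 2, 3 there. Returns NULL if malloc could not find the memory.
int *make_list(void)
{
    // Heap counterpart of the stack declaration `int list[3];`
    int *list = malloc(3 * sizeof(int));
    if (list == NULL)
    {
        return NULL;
    }

    // Bracket notation works on the malloc'd chunk just as it does on an array
    list[0] = 1;
    list[1] = 2;
    list[2] = 3;
    return list;
}
```

The caller owns the returned chunk and is expected to pass it to free once it is no longer needed.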
11587 11:15:58,152 --> 11:16:00,002 So just a good error check there. 11588 11:16:00,002 --> 11:16:02,252 Now let me go ahead and\ninitialize this list. 11589 11:16:02,252 --> 11:16:04,201 So list[0] will be 1 again. 11590 11:16:07,921 --> 11:16:10,292 So that's the same kind\nof syntax as before. 11591 11:16:10,292 --> 11:16:13,412 And notice this equivalence. 11592 11:16:13,411 --> 11:16:18,211 Recall that there's this relationship\n 11593 11:16:18,211 --> 11:16:21,031 And arrays are really just doing\npointer arithmetic for you 11594 11:16:21,031 --> 11:16:22,741 wherever the square bracket notation is used. 11595 11:16:22,741 --> 11:16:27,511 So if I've asked myself here, in line\n 11596 11:16:27,512 --> 11:16:32,732 it is perfectly OK to treat it now like\n 11597 11:16:32,732 --> 11:16:35,222 Because the computer will\ndo the arithmetic for me 11598 11:16:35,222 --> 11:16:37,921 and find the first location,\nthe second, and the third. 11599 11:16:37,921 --> 11:16:42,031 If you really want to be\ncool and hacker-like, well 11600 11:16:42,031 --> 11:16:48,781 you could say *list = 1,\n*(list + 1) = 2, *(list + 2) = 3. 11601 11:16:51,362 --> 11:16:53,701 That's the same thing\nusing very explicit 11602 11:16:53,701 --> 11:16:56,311 pointer arithmetic, which we\nlooked at briefly last week. 11603 11:16:56,311 --> 11:16:58,651 But this is atrocious to\nlook at for most people. 11604 11:16:58,652 --> 11:17:00,342 It's just not very user-friendly. 11605 11:17:00,341 --> 11:17:03,271 It's longer to type, so\nmost people, even when 11606 11:17:03,271 --> 11:17:06,151 allocating memory dynamically\nas I did a second ago 11607 11:17:06,152 --> 11:17:10,112 would just use the more\nfamiliar notation of an array. 11608 11:17:11,792 --> 11:17:16,322 Now suppose time passes\nand I realize, oh shoot 11609 11:17:16,322 --> 11:17:21,302 I really wanted this array to\nbe of size 4 instead of size 3. 11610 11:17:21,302 --> 11:17:23,844 Now, obviously, I could just\nrewind and like fix the program.
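The equivalence between square brackets and explicit pointer arithmetic mentioned above can be sketched like this. The function name `fill_with_pointers` is hypothetical; it assumes the caller passes the address of a chunk with room for at least 3 ints.

```c
// Hypothetical helper: stores 1, 2, 3 using explicit pointer arithmetic.
// Assumes list points at memory with room for at least 3 ints.
void fill_with_pointers(int *list)
{
    *list = 1;       // same effect as list[0] = 1
    *(list + 1) = 2; // same effect as list[1] = 2
    *(list + 2) = 3; // same effect as list[2] = 3
}
```

The parentheses matter: `*(list + 1)` goes to the next int in memory, whereas `*list + 1` would read the first int and add 1 to its value.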
11611 11:17:23,843 --> 11:17:25,801 But suppose that this is\na much larger program. 11612 11:17:25,802 --> 11:17:28,172 And I've realized, at\nthis point, that I need 11613 11:17:28,171 --> 11:17:31,561 to be able to dynamically add more\n 11614 11:17:32,222 --> 11:17:33,762 Well, let me go ahead and do this. 11615 11:17:33,762 --> 11:17:36,152 Let me just say, all\nright, list should actually 11616 11:17:36,152 --> 11:17:42,182 be the result of asking for 4\nchunks of memory from malloc. 11617 11:17:42,182 --> 11:17:46,217 And then, I could do something\nlike this, list[3] = 4. 11618 11:17:49,171 --> 11:17:52,182 Now this is buggy, potentially,\nin a couple of ways. 11619 11:17:52,182 --> 11:17:59,012 But let me ask first, what's really\n 11620 11:17:59,012 --> 11:18:03,332 The goal at hand is to start with\n 11621 11:18:03,332 --> 11:18:05,141 And I want to add a number 4 to it. 11622 11:18:05,141 --> 11:18:10,862 So at the moment, in line 17, I've asked\n 11623 11:18:12,421 --> 11:18:14,612 And then I'm adding the number 4 to it. 11624 11:18:14,612 --> 11:18:18,092 But I have skipped a few\nsteps and broken this somehow. 11625 11:18:19,375 --> 11:18:21,504 AUDIENCE: You don't know\nexactly [INAUDIBLE]. 11626 11:18:22,171 --> 11:18:24,542 I don't necessarily know where\n 11627 11:18:24,542 --> 11:18:26,042 It's probably not\ngoing to be immediately 11628 11:18:26,042 --> 11:18:27,391 adjacent to the previous chunk. 11629 11:18:27,391 --> 11:18:30,222 And so, yes, even though I'm\nputting the number 4 there 11630 11:18:30,222 --> 11:18:34,182 I haven't copied the 1, the 2, or\n 11631 11:18:35,881 --> 11:18:40,112 well, that's actually, indeed,\n 11632 11:18:40,112 --> 11:18:43,561 I am orphaning the\noriginal chunk of memory. 11633 11:18:43,561 --> 11:18:46,741 If you think of the picture that\n 11634 11:18:46,741 --> 11:18:52,981 up here on line 5 that allocates\n 11635 11:18:55,752 --> 11:18:59,131 But as soon as I do this, I'm\nclobbering the value of list.
11636 11:18:59,131 --> 11:19:01,441 And saying no, don't point\nat this chunk of memory. 11637 11:19:01,442 --> 11:19:05,382 Point at this chunk of memory, at\n 11638 11:19:05,381 --> 11:19:07,711 where the original chunk of memory is. 11639 11:19:07,711 --> 11:19:12,301 So the right way to do something like\n 11640 11:19:12,302 --> 11:19:14,880 Let me go ahead and give\nmyself a temporary variable. 11641 11:19:14,879 --> 11:19:16,171 And I'll literally call it TMP. 11642 11:19:16,171 --> 11:19:18,301 T-M-P, like I did last week. 11643 11:19:18,302 --> 11:19:21,602 So that I can now ask the computer for\n 11644 11:19:22,771 --> 11:19:25,711 I'm going to again say\nif TMP equals null 11645 11:19:25,711 --> 11:19:27,851 I'm going to say bad\nthings happened here. 11646 11:19:29,042 --> 11:19:31,322 And you know what,\njust to be tidy, let me 11647 11:19:31,322 --> 11:19:34,023 free the original list before I quit. 11648 11:19:34,023 --> 11:19:35,731 Because remember from\nlast week, any time 11649 11:19:35,732 --> 11:19:38,132 you use malloc you\neventually have to use free. 11650 11:19:38,131 --> 11:19:41,521 But this chunk of code here\nis just a safety check. 11651 11:19:41,521 --> 11:19:43,921 If there's no more memory,\nthere's nothing to see here. 11652 11:19:43,921 --> 11:19:46,981 I'm just going to clean\nup my state and quit. 11653 11:19:46,982 --> 11:19:50,322 But now, if I have asked\nfor this chunk of memory 11654 11:19:50,322 --> 11:19:55,682 now I can do this: for int i gets 0. 11655 11:19:58,082 --> 11:20:00,002 What if I do something like this? 11656 11:20:04,021 --> 11:20:08,461 That would seem to have the effect\n 11657 11:20:09,281 --> 11:20:12,991 And then, I think I need to\ndo one last thing: TMP[3] 11658 11:20:12,991 --> 11:20:14,941 gets the number 4, for instance. 11659 11:20:14,942 --> 11:20:18,961 Again, I'm hard coding the numbers\nfor the sake of discussion. 11660 11:20:18,961 --> 11:20:23,942 After I've done this,\nwhat could I now do?
11661 11:20:23,942 --> 11:20:28,472 I could now set list equal to TMP. 11662 11:20:28,472 --> 11:20:31,529 And now, I have updated\nmy list properly. 11663 11:20:31,529 --> 11:20:32,822 So let me go ahead and do this. 11664 11:20:36,961 --> 11:20:42,301 Let me go ahead and print each of these\n 11665 11:20:42,302 --> 11:20:45,372 And then, I'm going to return 0 just\n 11666 11:20:45,372 --> 11:20:49,472 Now, so to recap, we\ninitialize the original array 11667 11:20:49,472 --> 11:20:52,622 of size 3 and plug in\nthe values 1, 2, 3. 11668 11:20:53,442 --> 11:20:55,692 And then, I realize, wait a\nminute, I need more space. 11669 11:20:55,692 --> 11:20:58,067 And so I asked the computer\nfor a second chunk of memory. 11670 11:20:59,281 --> 11:21:01,949 Just as a safety check, I make\nsure that TMP doesn't equal null. 11671 11:21:01,949 --> 11:21:03,489 Because if it does, I'm out of memory. 11672 11:21:03,489 --> 11:21:05,072 So I should just quit altogether. 11673 11:21:05,072 --> 11:21:07,591 But once I'm sure that\nit's not null, I'm 11674 11:21:07,591 --> 11:21:12,931 going to copy all the values from\n 11675 11:21:12,932 --> 11:21:16,391 And then, I'm going to add my new\n 11676 11:21:16,391 --> 11:21:19,891 And then, now that I'm done playing\n 11677 11:21:19,891 --> 11:21:23,341 I'm going to remember\nin my list variable what 11678 11:21:23,341 --> 11:21:25,381 the address is of this\nnew chunk of memory. 11679 11:21:25,381 --> 11:21:28,051 And then, I'm going to print\nall of those values out. 11680 11:21:28,052 --> 11:21:31,832 So at least, aesthetically, when I\n 11681 11:21:31,832 --> 11:21:34,141 except for my missing semicolon. 11682 11:21:38,101 --> 11:21:40,771 Implicitly declaring a\nlibrary function malloc. 11683 11:21:40,771 --> 11:21:45,230 What's my mistake any time\nyou see that kind of error?
11684 11:21:46,862 --> 11:21:52,182 So up here, I forgot to do include\n 11685 11:21:52,182 --> 11:21:53,972 Let me go ahead and,\nagain, do make list. 11686 11:21:56,432 --> 11:21:59,311 And I should see 1, 2, 3, 4. 11687 11:21:59,311 --> 11:22:03,122 But there's still a bug here. 11688 11:22:03,122 --> 11:22:05,792 Does anyone see the\nbug-- or question? 11689 11:22:05,792 --> 11:22:07,582 AUDIENCE: You forgot to free them. 11690 11:22:07,582 --> 11:22:08,272 SPEAKER 1: I'm sorry, say again. 11691 11:22:08,271 --> 11:22:09,951 AUDIENCE: You forgot to free them. 11692 11:22:09,951 --> 11:22:12,051 SPEAKER 1: I forgot to\nfree the original list. 11693 11:22:12,052 --> 11:22:15,652 And we could see this, even if not\n 11694 11:22:15,652 --> 11:22:18,329 If I do something like\nValgrind of ./list 11695 11:22:18,328 --> 11:22:19,911 remember our tool from this past week. 11696 11:22:19,911 --> 11:22:22,791 Let me increase the size of my\nterminal window, temporarily. 11697 11:22:22,792 --> 11:22:25,022 The output is crazy cryptic at first. 11698 11:22:25,021 --> 11:22:30,261 But, notice that I have definitely\n 11699 11:22:30,262 --> 11:22:32,632 And indeed, it's even\npointing at the line number 11700 11:22:32,631 --> 11:22:34,411 in which some of those bytes were lost. 11701 11:22:34,411 --> 11:22:36,411 So let me go ahead and go back to my code. 11702 11:22:36,411 --> 11:22:41,091 And indeed, I think what I need to do\n 11703 11:22:41,091 --> 11:22:44,631 pointing it at this new chunk\nof memory instead of the old 11704 11:22:44,631 --> 11:22:47,391 I think I now need to\nfirst, proactively 11705 11:22:47,391 --> 11:22:49,942 say free the old list of memory. 11706 11:22:51,961 --> 11:22:56,731 So if I now do Make List and do dot\n 11707 11:22:56,732 --> 11:22:59,932 And, if I cross my fingers\nand run Valgrind again 11708 11:22:59,932 --> 11:23:03,921 after increasing my window\nsize, hopefully here.
11709 11:23:06,561 --> 11:23:09,502 It seems like less memory is lost. 11710 11:23:09,502 --> 11:23:11,932 What have I now forgotten to do? 11711 11:23:11,932 --> 11:23:13,912 AUDIENCE: You forgot to free the end. 11712 11:23:13,911 --> 11:23:16,221 SPEAKER 1: I forgot to free\nit at the very end, too. 11713 11:23:16,222 --> 11:23:19,042 Because I still have a chunk of\nmemory that I got from malloc. 11714 11:23:19,042 --> 11:23:21,682 So let me go to the very\nbottom of the program now. 11715 11:23:21,682 --> 11:23:26,811 And after I'm done senselessly\njust printing this thing out 11716 11:23:29,932 --> 11:23:33,262 And now let me do make list, ./list. 11717 11:23:35,152 --> 11:23:39,682 Now let's do Valgrind\nof ./list, Enter. 11718 11:23:39,682 --> 11:23:43,012 And now, hopefully, all\nheap blocks were freed. 11719 11:23:44,500 --> 11:23:47,542 So this is perhaps the best output\n 11720 11:23:47,542 --> 11:23:50,432 I used the heap, but I freed\nall the memory as well. 11721 11:23:50,432 --> 11:23:52,112 So there were 2 fixes needed there. 11722 11:23:52,612 --> 11:23:56,391 Any questions then on this array-based\n 11723 11:23:56,391 --> 11:23:59,012 is statically allocating\nan array, so to speak. 11724 11:23:59,012 --> 11:24:00,711 By just hard coding the number 3. 11725 11:24:00,711 --> 11:24:04,671 The second version now is\ndynamically allocating the array 11726 11:24:04,671 --> 11:24:06,862 using not the stack but the heap. 11727 11:24:06,862 --> 11:24:10,281 But it, too, suffers from the\nslowness we described earlier 11728 11:24:10,281 --> 11:24:12,771 of having to copy all those\nvalues from one to the other. 11729 11:24:14,665 --> 11:24:17,340 AUDIENCE: Why do you not\nhave to free the TMP? 11730 11:24:18,381 --> 11:24:20,301 Why did I not have to free the TMP? 11731 11:24:20,302 --> 11:24:22,612 I essentially did eventually. 11732 11:24:22,612 --> 11:24:27,842 Because TMP was pointing\nat the chunk of 4 integers.
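The grow-by-copying steps and the two fixes discussed above can be sketched as one function. This is a minimal sketch, not the lecture's list.c: the name `append_by_copy` and its parameters are hypothetical, and it assumes `list` came from malloc and currently holds `n` ints.

```c
#include <stdlib.h>

// Hypothetical helper: returns a new heap chunk holding the n old values
// plus one new value at the end. The old chunk is freed either way.
int *append_by_copy(int *list, int n, int value)
{
    int *tmp = malloc((n + 1) * sizeof(int));
    if (tmp == NULL)
    {
        // Out of memory: tidy up the original chunk before giving up
        free(list);
        return NULL;
    }

    // The O(n) step: copy every existing value into the new chunk
    for (int i = 0; i < n; i++)
    {
        tmp[i] = list[i];
    }
    tmp[n] = value;

    // First fix: free the old chunk before its address is forgotten
    free(list);
    return tmp;
}
```

A caller would write something like `list = append_by_copy(list, 3, 4);` and later call `free(list);` once, which is the second fix. No separate free of tmp is needed because, after the assignment, tmp and list hold the same address.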
11733 11:24:27,841 --> 11:24:33,291 But on line 33 here,\nI assigned list to be 11734 11:24:33,292 --> 11:24:36,061 identical to what TMP was pointing at. 11735 11:24:36,061 --> 11:24:40,654 And so, when I finally freed the list,\n 11736 11:24:40,654 --> 11:24:43,822 In fact, if I wanted to, I could say\n 11737 11:24:43,822 --> 11:24:45,561 But conceptually, it's wrong. 11738 11:24:45,561 --> 11:24:49,612 Because at this point in the story, I\n 11739 11:24:50,722 --> 11:24:52,822 But they were the same at\nthat point in the story. 11740 11:24:53,322 --> 11:24:55,360 AUDIENCE: Is [? the line ?] part of it? 11741 11:24:56,402 --> 11:24:58,832 And long story short,\neverything we're doing thus far 11742 11:24:58,832 --> 11:25:00,302 is still in the world of arrays. 11743 11:25:00,302 --> 11:25:02,192 The only distinction\nwe're making is that 11744 11:25:02,192 --> 11:25:08,702 in version 1, when I said int list\n 11745 11:25:08,701 --> 11:25:12,631 So-called statically allocated\non the stack, as per last week. 11746 11:25:12,631 --> 11:25:16,381 This version now is still dealing with\n 11747 11:25:16,381 --> 11:25:18,461 and using dynamic memory allocation. 11748 11:25:18,461 --> 11:25:20,979 So that I can still use an\narray per the first pictures 11749 11:25:22,021 --> 11:25:24,551 But I can at least grow\nthe array if I want. 11750 11:25:24,552 --> 11:25:28,472 So we haven't even now solved this, even\n 11751 11:25:30,061 --> 11:25:34,411 AUDIENCE: How are you able to free\n 11752 11:25:34,411 --> 11:25:37,201 SPEAKER 1: How am I able to free list? 11753 11:25:37,201 --> 11:25:41,791 I freed the original address of list. 11754 11:25:41,792 --> 11:25:44,702 I, then, changed what list is storing. 11755 11:25:44,701 --> 11:25:47,551 I'm moving its arrow to\na new chunk of memory. 11756 11:25:47,552 --> 11:25:51,032 And that is perfectly reasonable\nfor me to now manipulate 11757 11:25:51,031 --> 11:25:54,661 because now list is pointing\nat the same value of TMP. 
11758 11:25:54,661 --> 11:26:00,091 And TMP is what was given the return\n 11759 11:26:00,091 --> 11:26:02,261 So that chunk of memory is valid. 11760 11:26:02,262 --> 11:26:05,702 So these are just squares\non the board, right. 11761 11:26:05,701 --> 11:26:07,451 There's just pointers inside of them. 11762 11:26:07,451 --> 11:26:09,368 So what I'm technically\nsaying is, and I'm not 11763 11:26:09,368 --> 11:26:11,521 pointing I'm not freeing\nlist per se, I am 11764 11:26:11,521 --> 11:26:16,141 freeing the chunk of memory that begins\n 11765 11:26:16,141 --> 11:26:21,542 Therefore, if a few lines later, I\n 11766 11:26:21,542 --> 11:26:25,561 Totally reasonable to then touch that\n 11767 11:26:25,561 --> 11:26:27,871 Because you're not freeing\nthe variable per se 11768 11:26:27,872 --> 11:26:30,272 you're freeing the\naddress in the variable. 11769 11:26:31,622 --> 11:26:37,232 So let me back up here and\nnow make one final edit. 11770 11:26:37,232 --> 11:26:41,672 So let's finish this with\none final improvement here. 11771 11:26:41,671 --> 11:26:44,641 Because it turns out,\nthere's a somewhat better way 11772 11:26:44,641 --> 11:26:48,091 to actually resize an array\nas we've been doing here. 11773 11:26:48,091 --> 11:26:52,509 And there's another function in stdlib\n 11774 11:26:52,510 --> 11:26:55,052 And I'm just going to go in and\nmake a little bit of a change 11775 11:26:55,052 --> 11:26:58,060 here so that I can do the following. 11776 11:26:58,059 --> 11:26:59,851 Let me go ahead and\nfirst comment this now 11777 11:26:59,851 --> 11:27:02,801 just so we can keep track of what's\n 11778 11:27:02,802 --> 11:27:09,452 So dynamically allocate\nan array of size 3. 11779 11:27:09,451 --> 11:27:14,131 Assign 3 numbers to that array. 11780 11:27:15,811 --> 11:27:21,121 Allocate new array of size 4. 11781 11:27:21,122 --> 11:27:26,942 Copy numbers from old\narray into new array. 11782 11:27:26,942 --> 11:27:31,652 And add fourth number to new array. 
11783 11:27:36,332 --> 11:27:41,942 Remember, if you will, new array\nusing my same list variable. 11784 11:27:49,741 --> 11:27:53,011 And we'll post this code online after\n 11785 11:27:53,012 --> 11:27:56,702 So it turns out that we can reduce\n 11786 11:27:56,701 --> 11:27:59,461 Not so much with the printing\nhere, but with this copying. 11787 11:27:59,461 --> 11:28:01,741 Turns out C does have a\nfunction called realloc 11788 11:28:01,741 --> 11:28:07,061 that can actually handle the resizing\n 11789 11:28:07,061 --> 11:28:09,182 I'm going to scroll up\nto where I previously 11790 11:28:09,182 --> 11:28:12,302 allocated a new array of size 4. 11791 11:28:12,302 --> 11:28:19,502 And I'm instead going to say this,\n 11792 11:28:19,502 --> 11:28:21,959 Now, previously this wasn't\nnecessarily possible. 11793 11:28:21,959 --> 11:28:23,792 Because recall that we\nhad painted ourselves 11794 11:28:23,792 --> 11:28:25,625 into a corner with the\nexample on the screen 11795 11:28:25,625 --> 11:28:28,472 where "Hello, world" happened to\n 11796 11:28:29,891 --> 11:28:32,822 Let me use realloc, for reallocate. 11797 11:28:32,822 --> 11:28:36,122 And pass in not just the size\nof memory we want this time 11798 11:28:36,122 --> 11:28:39,812 but also the address\nthat we want to resize. 11799 11:28:39,811 --> 11:28:43,421 Which, again, is this array called list. 11800 11:28:43,921 --> 11:28:46,811 The code thereafter is\npretty much the same. 11801 11:28:46,811 --> 11:28:50,682 But what I don't need to do is this. 11802 11:28:50,682 --> 11:28:54,002 So realloc is a pretty handy\n 11803 11:28:54,002 --> 11:28:57,152 If at the very beginning of class,\n 11804 11:28:57,152 --> 11:29:00,491 And someone's instinct was to just plop\n 11805 11:29:00,491 --> 11:29:03,241 If there's available memory,\nrealloc will just do that.
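The realloc call described above can be sketched as follows. The helper name `append_with_realloc` and its parameters are hypothetical. In the standard library, realloc takes the existing address first and the new size in bytes second; it either grows the chunk in place or moves the contents to a new chunk and frees the old one itself.

```c
#include <stdlib.h>

// Hypothetical helper: grows list from n ints to n + 1 ints and appends
// value. Assumes list was obtained from malloc or realloc.
int *append_with_realloc(int *list, int n, int value)
{
    // Keep the result in tmp so the old address is not lost on failure
    int *tmp = realloc(list, (n + 1) * sizeof(int));
    if (tmp == NULL)
    {
        // On failure realloc leaves the original chunk alone, so free it here
        free(list);
        return NULL;
    }

    // No copy loop and no free of the old chunk: realloc handled both
    tmp[n] = value;
    return tmp;
}
```

Compared with copying by hand, the loop and the explicit free of the old chunk disappear. The copy can still happen inside realloc when there is no room to grow in place.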
11806 11:29:03,241 --> 11:29:07,682 And boom, it will just grow the array\n 11807 11:29:07,682 --> 11:29:11,641 If, though, it realizes, sorry, there's\n 11808 11:29:11,641 --> 11:29:14,521 or something else there,\nrealloc will handle 11809 11:29:14,521 --> 11:29:18,211 the trouble of moving that whole\narray from 1 chunk of memory 11810 11:29:18,211 --> 11:29:20,491 originally, to a new chunk of memory. 11811 11:29:20,491 --> 11:29:26,881 And then realloc will return to you,\n 11812 11:29:26,881 --> 11:29:31,031 And it will handle the process\nof freeing the old chunk for you. 11813 11:29:31,031 --> 11:29:33,281 So you do not need to do this yourself. 11814 11:29:33,281 --> 11:29:36,612 So in fact, let me go ahead\nand get rid of this as well. 11815 11:29:36,612 --> 11:29:41,582 So realloc just condenses, a lot of what\n 11816 11:29:41,582 --> 11:29:45,592 Whereby, realloc handles it for you. 11817 11:29:46,091 --> 11:29:49,151 So that's the final improvement\non this array-based approach. 11818 11:29:49,152 --> 11:29:51,932 So what now, knowing\nwhat your memory is 11819 11:29:51,932 --> 11:29:54,881 what can we now do with it that\nsolves that kind of problem? 11820 11:29:54,881 --> 11:29:56,801 Because the world is\ngoing to get really slow. 11821 11:29:56,802 --> 11:29:59,802 And our apps, and our phones, and our\n 11822 11:29:59,802 --> 11:30:04,032 if we're just constantly wasting\n 11823 11:30:04,031 --> 11:30:05,891 What could we perhaps do instead? 11824 11:30:05,891 --> 11:30:07,961 Well there's one new\npiece of syntax today 11825 11:30:07,961 --> 11:30:11,322 that builds on these 3 pieces\nof syntax from the past. 11826 11:30:11,322 --> 11:30:13,182 Recall, that we've\nlooked at struct, which 11827 11:30:13,182 --> 11:30:16,302 is a keyword in C, that just lets\nyou invent your own structure. 11828 11:30:16,302 --> 11:30:19,542 Your own variable, if you will,\nin conjunction with typedef. 
11829 11:30:19,542 --> 11:30:23,682 Which lets you say a person has a name\n 11830 11:30:23,682 --> 11:30:26,141 Or a candidate has a name\nand some number of votes. 11831 11:30:26,141 --> 11:30:30,521 You can encapsulate multiple pieces of\n 11832 11:30:30,521 --> 11:30:34,641 What did we use the Dot Notation\nfor now, a couple of times? 11833 11:30:34,641 --> 11:30:37,949 What does the Dot operator do in C? 11834 11:30:37,949 --> 11:30:39,241 AUDIENCE: Access the structure. 11835 11:30:39,631 --> 11:30:41,682 To access the field\ninside of a structure. 11836 11:30:41,682 --> 11:30:43,807 So if you've got a person\nwith a name and a number 11837 11:30:43,807 --> 11:30:46,832 you could say something like\nperson.name or person.number 11838 11:30:46,832 --> 11:30:48,992 if person is the name\nof one such variable. 11839 11:30:48,991 --> 11:30:51,331 Star, of course, we've\nseen now in a few ways. 11840 11:30:51,332 --> 11:30:55,022 Like way back in week 1, we\nsaw it as like, multiplication. 11841 11:30:55,021 --> 11:30:58,231 Last week, we began to see it\nin the context of pointers 11842 11:30:58,232 --> 11:31:00,452 whereby, you use it\nto declare a pointer. 11843 11:31:00,451 --> 11:31:03,041 Like, int* p, or something like that. 11844 11:31:03,042 --> 11:31:05,522 But we also saw it in\none other context, which 11845 11:31:05,521 --> 11:31:08,861 was like the opposite, which\nwas the dereference operator. 11846 11:31:08,862 --> 11:31:10,754 Which says if this is\nan address, that is 11847 11:31:10,754 --> 11:31:13,711 if this is a variable like a pointer,\n 11848 11:31:13,711 --> 11:31:17,461 then with no int or no char,\nno data type in front of it. 11849 11:31:17,461 --> 11:31:19,351 That means go to that address. 11850 11:31:19,351 --> 11:31:22,781 And it dereferences the pointer\nand goes to that location. 
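The three pieces of syntax just reviewed can be sketched together as follows. The person struct mirrors the one mentioned in the lecture; the helper names name_of and star_demo are assumptions made for illustration.

```c
#include <stddef.h>

// A structure with two fields, defined with typedef so it can be used
// as a type called person.
typedef struct
{
    const char *name;
    const char *number;
} person;

// Dot operator: access a field inside a structure variable.
const char *name_of(person p)
{
    return p.name;
}

// The three uses of the star seen so far.
int star_demo(void)
{
    int n = 50;
    int *p = &n;   // star in a declaration: p is a pointer to an int
    *p = 51;       // star with no type in front: go to that address
    return n * 2;  // star between two values: multiplication
}
```

Because p holds the address of n, writing through *p changes n itself, which is why the function returns 102 rather than 100.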
11851 11:31:22,781 --> 11:31:25,201 So it turns out that using\nthese 3 building blocks 11852 11:31:25,201 --> 11:31:28,241 you can actually start to now use\n 11853 11:31:28,741 --> 11:31:31,201 And even next week, when\nwe transition to Python 11854 11:31:31,201 --> 11:31:33,841 and you start to get a\nlot of features for free. 11855 11:31:33,841 --> 11:31:36,031 Like a single line of\ncode will just do so much 11856 11:31:36,031 --> 11:31:40,652 more in Python than it does in C. It\n 11857 11:31:40,652 --> 11:31:42,542 And just so you've seen it already. 11858 11:31:42,542 --> 11:31:47,252 It turns out that it's so\ncommon in C to use this operator 11859 11:31:47,252 --> 11:31:51,271 to go inside of a structure and\n 11860 11:31:51,271 --> 11:31:53,731 that there's shorthand\nnotation for it, a.k.a. 11861 11:31:54,932 --> 11:31:56,576 That literally looks like an arrow. 11862 11:31:56,576 --> 11:31:58,951 So recall last week, I was in\nthe habit of pointing, even 11863 11:32:00,152 --> 11:32:04,502 This arrow notation, a\nhyphen and an angled bracket 11864 11:32:04,502 --> 11:32:11,432 denotes going to an address and\nlooking at a field inside of it. 11865 11:32:11,432 --> 11:32:13,722 But we'll see this in\npractice in just a bit. 11866 11:32:13,722 --> 11:32:16,592 So what might be the\nsolution, now, to this problem 11867 11:32:16,591 --> 11:32:20,101 we saw a moment ago whereby, we had\n 11868 11:32:20,101 --> 11:32:23,381 And our memory, a few moments\nago, looked like this. 11869 11:32:23,381 --> 11:32:28,201 We could just copy the whole existing\n 11870 11:32:29,491 --> 11:32:33,331 What would another, perhaps\nbetter solution longer term 11871 11:32:33,332 --> 11:32:38,627 be, that doesn't require\nconstantly moving stuff around? 
11872 11:32:38,627 --> 11:32:40,502 Maybe hang in there for\nyour instincts if you 11873 11:32:40,502 --> 11:32:44,682 know the buzz phrase we're looking for\n 11874 11:32:44,682 --> 11:32:47,281 But if we want to avoid\nmoving the 1, 2, and the 3 11875 11:32:47,281 --> 11:32:49,981 but we still want to be able\nto add endless amounts of data. 11876 11:32:51,961 --> 11:32:54,872 So maybe create some kind\nof list using pointers that 11877 11:32:54,872 --> 11:32:56,852 just point at a new location, right. 11878 11:32:56,851 --> 11:32:59,972 In an ideal world, even\nthough this piece of memory 11879 11:32:59,972 --> 11:33:02,912 is being used by this h in\nthe string "Hello, world" 11880 11:33:02,911 --> 11:33:05,461 maybe we could somehow use\na pointer from last week. 11881 11:33:05,461 --> 11:33:09,811 Like an arrow, that says after the\n 11882 11:33:11,521 --> 11:33:15,791 And you just stitch together\nthese integers in memory 11883 11:33:15,792 --> 11:33:17,822 so that each one leads to the next. 11884 11:33:17,822 --> 11:33:21,182 It's not necessarily the case\nthat it's literally back-to-back. 11885 11:33:21,182 --> 11:33:23,432 That would have the\ndownside, it would seem 11886 11:33:23,432 --> 11:33:24,991 of costing us a little bit of space. 11887 11:33:24,991 --> 11:33:27,601 Like a pointer, which recall,\ntakes up some amount of space. 11888 11:33:27,601 --> 11:33:29,881 Typically 8 bytes or 64 bits. 11889 11:33:29,881 --> 11:33:33,481 But I don't have to copy potentially\na huge amount of data just 11890 11:33:34,921 --> 11:33:36,760 And so these things do have a name. 11891 11:33:36,760 --> 11:33:38,552 And indeed, these things\nare what generally 11892 11:33:38,552 --> 11:33:42,302 would be called a linked list. 11893 11:33:42,302 --> 11:33:44,822 A linked list captures\nexactly that intuition 11894 11:33:44,822 --> 11:33:46,542 of linking together things in memory. 11895 11:33:46,542 --> 11:33:48,012 So let's take a look at an example. 
11896 11:33:48,012 --> 11:33:49,804 Here's a computer's\nmemory in the abstract. 11897 11:33:49,803 --> 11:33:52,621 Suppose that I'm trying\nto create an array. 11898 11:33:52,622 --> 11:33:55,682 Let's generalize it as\na list, now, of numbers. 11899 11:33:55,682 --> 11:33:57,362 An array has a very specific meaning. 11900 11:33:57,362 --> 11:34:00,092 It's memory that's contiguous,\nback, to back, to back. 11901 11:34:00,091 --> 11:34:03,721 At the end of the day, I as the\n 11902 11:34:05,822 --> 11:34:09,781 I don't really care how it's stored. 11903 11:34:09,781 --> 11:34:12,091 I don't care how it's stored\nwhen I'm writing the code 11904 11:34:12,091 --> 11:34:13,924 I just want it to work\nat the end of the day. 11905 11:34:13,925 --> 11:34:16,052 So suppose that I first\ninsert my number 1. 11906 11:34:16,052 --> 11:34:19,592 And, who knows, it ends up,\nup there at location, 0X123 11907 11:34:21,302 --> 11:34:23,552 Maybe there's something already here. 11908 11:34:23,552 --> 11:34:25,592 And heck, maybe there's\nsomething already here 11909 11:34:25,591 --> 11:34:28,576 but there's plenty of other options\nfor where this thing can go. 11910 11:34:28,576 --> 11:34:30,451 And suppose that, for\nthe sake of discussion 11911 11:34:30,451 --> 11:34:32,284 the first available\nspot for the next number 11912 11:34:32,285 --> 11:34:38,094 happens to be over here at location\n 11913 11:34:38,093 --> 11:34:40,051 So that's where I'm going\nto plop the number 2. 11914 11:34:40,052 --> 11:34:41,552 And where might the number 3 end up? 11915 11:34:41,552 --> 11:34:44,342 Oh I don't know, maybe\ndown over there at 0X789. 11916 11:34:44,341 --> 11:34:48,511 The point being, I don't know\nwhat is, or really care about 11917 11:34:48,512 --> 11:34:50,672 everything else that's\nin the computer's memory. 11918 11:34:50,671 --> 11:34:54,722 I just care that there are at\nleast 3 locations available where 11919 11:34:54,722 --> 11:34:57,781 I can put my 1, my 2, and my 3.
11920 11:34:57,781 --> 11:35:01,502 But the catch is, now that\nwe're not using an array 11921 11:35:01,502 --> 11:35:05,851 we can't just naively assume that\n 11922 11:35:06,991 --> 11:35:10,441 Add 2 to an index, and boom\nyou're at the next, next number. 11923 11:35:10,442 --> 11:35:14,852 Now you have to leave these little\n 11924 11:35:14,851 --> 11:35:17,161 to lead from one to the other. 11925 11:35:17,161 --> 11:35:19,351 And sometimes, it might be\nclose, a few bytes away. 11926 11:35:19,351 --> 11:35:23,292 Maybe, it's a whole gigabyte away\n 11927 11:35:25,021 --> 11:35:30,252 Like where do these pointers\ngo, as you proposed? 11928 11:35:30,752 --> 11:35:32,822 All I have access to here are bytes. 11929 11:35:32,822 --> 11:35:34,891 I've already stored the\n1, the 2, and the 3. 11930 11:35:37,961 --> 11:35:40,851 So let me, you put the pointers\nright next to these numbers. 11931 11:35:40,851 --> 11:35:44,891 So let me at least plan ahead, so that\n 11932 11:35:44,891 --> 11:35:47,951 recall from last week, for some\nmemory, I don't just ask it now 11933 11:35:47,951 --> 11:35:49,856 for space for just the number. 11934 11:35:49,857 --> 11:35:51,732 Let me start getting\ninto the habit of asking 11935 11:35:51,732 --> 11:35:56,832 malloc for enough space for the number\n 11936 11:35:56,832 --> 11:35:59,542 So it's a little more aggressive\nof me to ask for more memory. 11937 11:36:00,822 --> 11:36:02,622 And here is an example of a trade off. 11938 11:36:02,622 --> 11:36:06,402 Almost any time in CS, when you start\n 11939 11:36:06,402 --> 11:36:10,662 Or if you try to conserve space,\nyou might have to lose time. 11940 11:36:10,661 --> 11:36:12,161 It's being that trade off there. 11941 11:36:14,391 --> 11:36:15,942 Well let me abstract this away. 11942 11:36:15,942 --> 11:36:19,057 And either next to or below, I'm\n 11943 11:36:19,057 --> 11:36:20,182 for the sake of discussion. 11944 11:36:20,182 --> 11:36:22,152 So the arrows are a bit prettier. 
11945 11:36:22,152 --> 11:36:25,062 I've asked malloc for\nnow twice as much space 11946 11:36:25,061 --> 11:36:27,072 it would seem, than I previously needed. 11947 11:36:27,072 --> 11:36:31,016 But I'm going to use this second chunk\n 11948 11:36:31,016 --> 11:36:33,641 And I'm going to use this chunk\nof memory to refer to the next 11949 11:36:33,641 --> 11:36:35,451 essentially, stitching\nthis thing together. 11950 11:36:35,451 --> 11:36:37,511 So what should go in this first box? 11951 11:36:37,512 --> 11:36:41,082 Well, I claim the number, 0X456. 11952 11:36:41,082 --> 11:36:43,781 And it's written in hex because\nit represents a memory address. 11953 11:36:43,781 --> 11:36:47,801 But this is the equivalent of drawing\n 11954 11:36:47,802 --> 11:36:51,552 As a little check here, what\nshould go in this second box 11955 11:36:51,552 --> 11:36:55,422 if the goal is to stitch these\ntogether in order 1, 2, 3? 11956 11:36:55,421 --> 11:36:57,593 Feel free to just shout this out. 11957 11:36:59,052 --> 11:37:00,472 SPEAKER 1: OK, that worked well. 11958 11:37:01,396 --> 11:37:04,271 And you can't do that with the hands\n 11959 11:37:04,271 --> 11:37:08,511 So 0X789 should go here because that's\n 11960 11:37:08,512 --> 11:37:11,772 And then, we don't really have\nterribly many possibilities here. 11961 11:37:11,771 --> 11:37:14,441 This has to have a value, right. 11962 11:37:14,442 --> 11:37:19,312 Because at the end of the day, it's\n 11963 11:37:19,311 --> 11:37:22,651 So what value should go here,\nif this is the end of this list? 11964 11:37:23,652 --> 11:37:25,752 SPEAKER 1: So it could be 0X123. 11965 11:37:25,752 --> 11:37:29,531 The implication being that\nit would be a cyclical list. 11966 11:37:29,531 --> 11:37:32,051 Which is OK, but\npotentially problematic. 
11967 11:37:32,052 --> 11:37:36,102 If any of you have accidentally\n 11968 11:37:36,101 --> 11:37:39,161 because you had an infinite loop,\n 11969 11:37:39,161 --> 11:37:43,811 to give yourself the accidental\nprobability of an infinite loop. 11970 11:37:43,811 --> 11:37:46,398 What might be simpler than\nthat and ward that off? 11971 11:37:48,612 --> 11:37:50,322 SPEAKER 1: So just the null character. 11972 11:37:50,322 --> 11:37:53,021 Not N-U-L, confusingly, which\nis at the end of strings. 11973 11:37:53,021 --> 11:37:56,031 But N-U-L-L, as we\nintroduced it last week. 11974 11:37:58,061 --> 11:38:00,881 So this is just a special value\nthat programmers decades ago 11975 11:38:00,881 --> 11:38:04,991 decided that if you store the address\n 11976 11:38:04,991 --> 11:38:07,902 There's never going to be\nanything useful at 0x0. 11977 11:38:07,902 --> 11:38:11,082 Therefore, it's a sentinel\nvalue, just a special value 11978 11:38:12,281 --> 11:38:14,351 There's nowhere further to go. 11979 11:38:14,351 --> 11:38:17,951 It's OK to come back to your\n 11980 11:38:17,951 --> 11:38:19,871 But we'd better be\nsmart enough to, maybe 11981 11:38:19,872 --> 11:38:23,862 remember where did the list start\nso that you can detect cycles. 11982 11:38:23,862 --> 11:38:26,421 If you start looping around\nin this structure, otherwise. 11983 11:38:26,921 --> 11:38:29,121 But these addresses, who really\ncares at the end of the day 11984 11:38:30,402 --> 11:38:32,302 It really just now looks like this. 11985 11:38:32,302 --> 11:38:35,260 And indeed, this is how most anyone\n 11986 11:38:35,260 --> 11:38:36,552 if having a discussion at work. 11987 11:38:36,552 --> 11:38:38,344 Talking about what data\nstructure we should 11988 11:38:38,343 --> 11:38:40,271 use to solve some problem\nin the real world. 11989 11:38:40,271 --> 11:38:42,521 We don't care generally\nabout the addresses. 11990 11:38:42,521 --> 11:38:45,111 We care that in code we can access them. 
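The picture of three numbers stitched together, ending in NULL, can be sketched in code. This assumes a small struct holding a number and a pointer to the next such struct; the names list_length and stitched_demo are hypothetical, and the nodes live in local variables rather than malloc'd memory purely to keep the sketch short.

```c
#include <stddef.h>

// One element of the list: a number plus the address of the next element.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Follow the next pointers from the first node, counting nodes until the
// NULL sentinel says there is nowhere further to go.
int list_length(const node *first)
{
    int count = 0;
    for (const node *ptr = first; ptr != NULL; ptr = ptr->next)
    {
        count++;
    }
    return count;
}

// Stitch 1 -> 2 -> 3 together; the last node's pointer is NULL.
int stitched_demo(void)
{
    node three = {3, NULL};
    node two = {2, &three};
    node one = {1, &two};
    return list_length(&one);
}
```

If the last pointer held the address of the first node instead of NULL, the loop in list_length would never terminate, which is the infinite-loop risk of a cyclical list.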
11991 11:38:45,112 --> 11:38:48,072 But in terms of the concept\nalone this would be, perhaps 11992 11:38:48,072 --> 11:38:49,720 the right way to think about this. 11993 11:38:49,720 --> 11:38:51,678 All right, let me pause\nhere and see if there's 11994 11:38:51,678 --> 11:38:55,901 any questions on this idea of creating\n 11995 11:38:55,902 --> 11:39:00,022 not just the numbers like 1,\n2, 3, but twice as much data. 11996 11:39:00,021 --> 11:39:02,591 So that you have little\nbreadcrumbs in the form of pointers 11997 11:39:02,591 --> 11:39:05,991 that can lead you from one to the next. 11998 11:39:05,991 --> 11:39:08,155 Any questions on these linked lists? 11999 11:39:14,913 --> 11:39:19,506 AUDIENCE: So does this take\nmore memory than an array? 12000 11:39:19,506 --> 11:39:21,631 SPEAKER 1: This does take\nmore memory than an array 12001 11:39:21,631 --> 11:39:24,180 because I now need space\nfor these pointers. 12002 11:39:24,180 --> 11:39:28,151 And to be clear, I technically\ndidn't really draw this to scale. 12003 11:39:28,152 --> 11:39:31,082 Thus far, in the class, we've\ngenerally thought about integers 12004 11:39:31,082 --> 11:39:33,992 like, 1, 2 and 3, as\nbeing 4 bytes, or 32 bits. 12005 11:39:33,991 --> 11:39:37,021 I made the claim last week that\non modern computers, pointers 12006 11:39:37,021 --> 11:39:40,051 tend to be 8 bytes or 64 bits. 12007 11:39:40,052 --> 11:39:42,762 So, technically, this box should\nactually be a little bigger. 12008 11:39:42,762 --> 11:39:44,461 It was just going to look a\nlittle stupid in the picture. 12009 11:39:45,811 --> 11:39:48,811 But, indeed, you're using\nmore space as a result. 12010 11:39:50,269 --> 11:39:51,601 SPEAKER 1: Oh, how does-- sorry. 12011 11:39:51,601 --> 11:39:55,451 How does the computer identify\nuseful data from used data? 12012 11:39:55,451 --> 11:39:58,261 So, for instance, garbage\nvalues or non-garbage values.
12013 11:39:58,262 --> 11:40:00,902 For now, think of that\nas the job of malloc. 12014 11:40:00,902 --> 11:40:04,292 So when you ask malloc for memory,\nas we started to last week 12015 11:40:04,292 --> 11:40:07,472 malloc keeps track of the\naddresses of the memory 12016 11:40:07,472 --> 11:40:10,442 it has handed to you as valid values. 12017 11:40:10,442 --> 11:40:12,932 The other type of memory you\nuse, not just from the heap. 12018 11:40:12,932 --> 11:40:15,872 Because recall we briefly\ndiscussed that malloc uses space 12019 11:40:15,872 --> 11:40:18,872 from the heap, which was drawn at the\n 12020 11:40:18,872 --> 11:40:22,702 There's also stack memory, which is\n 12021 11:40:22,701 --> 11:40:25,201 And where all of the memory\nused by individual functions goes. 12022 11:40:25,201 --> 11:40:27,534 And that was drawn in the\npicture as working its way up. 12023 11:40:27,535 --> 11:40:30,302 That's just an artist's\nrendition of direction. 12024 11:40:30,302 --> 11:40:33,662 The compiler, essentially,\nwill also help 12025 11:40:33,661 --> 11:40:37,349 keep track of which values are\nvalid or not inside of the stack. 12026 11:40:37,349 --> 11:40:39,391 Or really the underlying\ncode that you've written 12027 11:40:39,391 --> 11:40:40,724 will keep track of that for you. 12028 11:40:40,724 --> 11:40:43,691 So it's managed for you at that point. 12029 11:40:44,792 --> 11:40:46,522 Sorry it took me a bit to catch on. 12030 11:40:46,521 --> 11:40:48,691 So let's now translate\nthis to actual code. 12031 11:40:48,692 --> 11:40:52,262 How could we implement this idea\n 12032 11:40:52,262 --> 11:40:53,641 And that's a term of art in CS. 12033 11:40:53,641 --> 11:40:57,692 Whenever you have some data structure\n 12034 11:40:57,692 --> 11:41:00,429 N-O-D-E, is the generic term for that. 12035 11:41:00,428 --> 11:41:02,261 So each of these might\nbe said to be a node.
12036 11:41:03,311 --> 11:41:06,103 Well a couple of weeks ago, we saw\n 12037 11:41:06,103 --> 11:41:07,741 like a student or a candidate. 12038 11:41:07,741 --> 11:41:12,421 And a student, or rather a person,\n 12039 11:41:12,421 --> 11:41:14,161 And we used a few pieces of syntax here. 12040 11:41:14,161 --> 11:41:17,371 One, we use the struct keyword,\nwhich gives us a data structure. 12041 11:41:17,372 --> 11:41:21,902 We use typedef, which defines the\nname person to be our new data 12042 11:41:21,902 --> 11:41:24,332 type representing that whole structure. 12043 11:41:24,332 --> 11:41:26,432 So we probably have the\nright ingredients here 12044 11:41:26,432 --> 11:41:28,982 to build up this thing called a node. 12045 11:41:28,982 --> 11:41:32,102 And just to be clear, what should\n 12046 11:41:32,917 --> 11:41:35,042 It's not going to be a name\nor a number, obviously. 12047 11:41:35,042 --> 11:41:39,732 But what should a node have in\nterms of those fields, perhaps? 12048 11:41:41,107 --> 11:41:44,082 SPEAKER 1: So a number like a\nnumber and a pointer in some form. 12049 11:41:44,082 --> 11:41:46,332 So let's translate this to actual code. 12050 11:41:46,332 --> 11:41:51,092 So let's rename person to node\nto capture this notion here. 12051 11:41:52,347 --> 11:41:54,222 If it's just going to\nbe an int, that's fine. 12052 11:41:54,222 --> 11:41:56,461 We can just say int number,\nor int n, or whatever 12053 11:41:56,461 --> 11:41:58,862 you want to call that particular field. 12054 11:41:58,862 --> 11:42:00,554 The next one is a little non-obvious. 12055 11:42:00,553 --> 11:42:02,761 And this is where things\nget a little weird at first 12056 11:42:02,762 --> 11:42:05,312 but, in retrospect, it\nshould all fit together. 12057 11:42:05,311 --> 11:42:11,112 Let me propose that, ideally, we\n 12058 11:42:11,112 --> 11:42:13,412 And I could call the word\nnext anything I want. 
12059 11:42:13,411 --> 11:42:17,591 Next just means what comes after\n 12060 11:42:17,591 --> 11:42:19,981 So a lot of CS people would\njust use next to represent 12061 11:42:22,741 --> 11:42:25,921 C and C compilers are\npretty naive, recall. 12062 11:42:25,921 --> 11:42:29,141 They only look at code top\nto bottom, left to right. 12063 11:42:29,141 --> 11:42:31,322 And any time they encounter\na word they have never 12064 11:42:31,322 --> 11:42:32,995 seen before, bad things happen. 12065 11:42:32,995 --> 11:42:34,412 Like, you can't compile your code. 12066 11:42:34,411 --> 11:42:36,401 You get some cryptic\nerror message or the like. 12067 11:42:36,402 --> 11:42:39,391 And that seems to be\nabout to happen here. 12068 11:42:39,391 --> 11:42:42,451 Because if the compiler is reading\nthis code from top to bottom 12069 11:42:42,451 --> 11:42:44,822 it's going to say, oh,\ninside of this struct 12070 11:42:44,822 --> 11:42:46,622 should be a variable called next. 12071 11:42:49,682 --> 11:42:52,951 Because it literally does\nnot find out until 2 lines 12072 11:42:52,951 --> 11:42:55,201 later, after that semicolon. 12073 11:42:55,201 --> 11:42:57,811 So the way to avoid this, which\nwe haven't quite seen before 12074 11:42:57,811 --> 11:43:02,701 is that you can temporarily name this\n 12075 11:43:02,701 --> 11:43:08,041 And then, down here inside of the\n 12076 11:43:08,042 --> 11:43:09,692 And then, you leave the rest alone. 12077 11:43:09,692 --> 11:43:14,102 This workaround is\npossible because now you're 12078 11:43:14,101 --> 11:43:17,222 teaching the compiler, from\nthe first line, that here comes 12079 11:43:17,222 --> 11:43:19,442 a data structure called struct node.
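The workaround described here can be sketched as the following definition. The field names number and next follow the lecture; the helper node_size is a hypothetical addition for checking how much memory one node takes.

```c
#include <stddef.h>

// Naming the struct on the first line ("struct node") lets the field below
// refer to it before the shorter alias "node" exists.
typedef struct node
{
    int number;
    struct node *next;  // must be spelled "struct node" here, not "node"
} node;                 // from this point on, "node" is a valid type name

// How many bytes one node occupies, as computed by the compiler.
size_t node_size(void)
{
    return sizeof(node);
}
```

On a typical 64-bit machine with 4-byte ints and 8-byte pointers, sizeof(node) is often 16 rather than 12, because the compiler may add padding so the pointer is aligned. The exact number is platform-dependent, which is one reason to ask sizeof rather than add the sizes by hand.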
12080 11:43:19,442 --> 11:43:22,902 Down here, you're shortening the name\n 12081 11:43:23,402 --> 11:43:26,485 It's just a little more convenient\n 12082 11:43:26,485 --> 11:43:30,241 But you do have to write struct\n 12083 11:43:30,241 --> 11:43:33,211 But that's OK because it's\nalready come into existence 12084 11:43:33,211 --> 11:43:35,373 now, as of that first line of code. 12085 11:43:35,374 --> 11:43:37,082 So that's the only\nfundamental difference 12086 11:43:37,082 --> 11:43:40,382 between what we did last week\nwith a person or a candidate. 12087 11:43:40,381 --> 11:43:45,371 We just now have to use this\nstruct workaround, syntactically. 12088 11:43:46,652 --> 11:43:50,491 AUDIENCE: So [INAUDIBLE] have like\n 12089 11:43:51,451 --> 11:43:56,551 SPEAKER 1: Why is the next variable\n 12090 11:43:58,631 --> 11:44:01,351 So think about the picture\nwe are trying to draw. 12091 11:44:01,351 --> 11:44:05,222 Technically, yes, each of these\narrows I deliberately drew 12092 11:44:07,982 --> 11:44:10,802 They need to point at the\nwhole data structure in memory. 12093 11:44:10,802 --> 11:44:13,082 Because the computer,\nultimately, and the compiler 12094 11:44:13,082 --> 11:44:16,952 in turn, needs to know that this\n 12095 11:44:18,521 --> 11:44:21,851 Inside of a node is a number\nand also another pointer. 12096 11:44:21,851 --> 11:44:24,252 So when you draw these\narrows, it would be 12097 11:44:24,252 --> 11:44:26,862 incorrect to point at just the number. 12098 11:44:26,862 --> 11:44:29,239 Because that throws\naway information that 12099 11:44:29,239 --> 11:44:31,572 would leave the compiler\nwondering, OK, I'm at a number. 12100 11:44:31,572 --> 11:44:32,682 Where the heck is the pointer? 12101 11:44:32,682 --> 11:44:34,932 You have to tell it that\nit's pointing at a whole node 12102 11:44:34,932 --> 11:44:38,338 so it knows a few bytes away\nis that corresponding pointer. 12103 11:44:40,665 --> 11:44:42,112 AUDIENCE: How do you [INAUDIBLE]. 
12104 11:44:42,112 --> 11:44:43,444 SPEAKER 1: Really good question. 12105 11:44:43,444 --> 11:44:46,731 It would seem that just as\ncopying the array earlier 12106 11:44:46,732 --> 11:44:49,942 required twice as much memory,\n 12107 11:44:49,942 --> 11:44:52,612 So, technically, twice as much\nplus 1 for the new number. 12108 11:44:52,612 --> 11:44:56,002 Here, too, it looks like we're\nusing twice as much memory, also. 12109 11:44:56,002 --> 11:44:58,881 And to my comment earlier, it's\n 12110 11:44:58,881 --> 11:45:02,752 because these pointers are 8 bytes, and\n 12111 11:45:04,762 --> 11:45:08,391 In the context of the array, you\n 12112 11:45:08,391 --> 11:45:10,231 So, yes, you needed\ntwice as much memory. 12113 11:45:10,232 --> 11:45:13,082 But then you were quickly\nfreeing the original array. 12114 11:45:13,082 --> 11:45:16,372 So you weren't consuming long-term,\n 12115 11:45:16,372 --> 11:45:19,772 The difference here, too, is\nthat, as we'll see in a moment 12116 11:45:19,771 --> 11:45:23,151 it turns out it's going to be\n 12117 11:45:23,152 --> 11:45:25,101 to insert new numbers in here. 12118 11:45:25,101 --> 11:45:28,101 Because I'm not going to have\nto do a huge amount of copying. 12119 11:45:28,101 --> 11:45:31,281 And even though I might still have\n 12120 11:45:31,281 --> 11:45:33,561 is going to take some\namount of time, I'm 12121 11:45:33,561 --> 11:45:36,951 not going to have to be asking for\n 12122 11:45:36,951 --> 11:45:40,671 And certain operations in the computer,\n 12123 11:45:40,671 --> 11:45:42,481 back memory, tends to be slower. 12124 11:45:42,482 --> 11:45:44,340 So we get to avoid\nthat situation as well. 12125 11:45:44,339 --> 11:45:46,131 There's going to be\nsome downsides, though. 
12126 11:45:47,182 --> 11:45:51,241 But we'll see in a bit just what some\n 12127 11:45:51,741 --> 11:45:56,222 So from here, if we go back to the\n 12128 11:45:56,222 --> 11:45:59,302 let's start to now build up a\nlinked list with some actual code. 12129 11:45:59,302 --> 11:46:03,682 How do you go about, in C,\nrepresenting a linked list in code? 12130 11:46:03,682 --> 11:46:06,262 Well, at the moment, it would\nactually be as simple as this. 12131 11:46:06,262 --> 11:46:09,412 You declare a variable,\ncalled list, for instance. 12132 11:46:09,411 --> 11:46:12,451 That itself stores\nthe address of a node. 12133 11:46:14,701 --> 11:46:17,362 So if you want to store\na linked list in memory 12134 11:46:17,362 --> 11:46:19,879 you just create a variable\ncalled list, or whatever else. 12135 11:46:19,879 --> 11:46:21,711 And you just say that\nthis variable is going 12136 11:46:21,711 --> 11:46:25,911 to be pointing at the first node in a\n 12137 11:46:25,911 --> 11:46:29,751 Because malloc is ultimately going\n 12138 11:46:29,752 --> 11:46:33,752 get at any one particular\nnode in memory. 12139 11:46:34,252 --> 11:46:36,171 So let's actually do\nthis in pictorial form. 12140 11:46:36,171 --> 11:46:39,171 When you write a line of\ncode, like I just did here-- 12141 11:46:39,171 --> 11:46:43,161 and I do not initialize it to\n 12142 11:46:44,211 --> 11:46:48,201 It does exist in memory as a box,\n 12143 11:46:48,201 --> 11:46:50,911 But I've deliberately\ndrawn Oscar inside of it. 12144 11:46:54,112 --> 11:46:55,444 SPEAKER 1: It's a garbage value. 12145 11:46:55,444 --> 11:46:59,881 I have been allocated the\nvariable in memory, called list. 12146 11:46:59,881 --> 11:47:03,951 Which is going to give me 64 bits\n 12147 11:47:04,951 --> 11:47:07,701 But if I myself have not\nused the assignment operator 12148 11:47:07,701 --> 11:47:11,311 it's not going to get magically\n 12149 11:47:11,811 --> 11:47:13,951 It's not going to even give me a node. 
12150 11:47:13,951 --> 11:47:18,631 This is literally just going to be an\n 12151 11:47:18,631 --> 11:47:20,241 So what would be a solution here? 12152 11:47:20,241 --> 11:47:23,241 Suppose that I'm beginning\nto create my linked list 12153 11:47:23,241 --> 11:47:24,771 but I don't have any nodes yet. 12154 11:47:24,771 --> 11:47:28,783 What would be a sensible thing to\n 12155 11:47:31,319 --> 11:47:32,612 SPEAKER 1: So just null, right. 12156 11:47:32,612 --> 11:47:34,342 When in doubt with\npointers, generally it's 12157 11:47:34,341 --> 11:47:36,091 a good thing to\ninitialize things to null 12158 11:47:36,091 --> 11:47:37,641 so at least it's not a garbage value. 12159 11:47:39,900 --> 11:47:41,692 But it's a special\nvalue you can then check 12160 11:47:41,692 --> 11:47:43,622 for with a conditional, or the like. 12161 11:47:43,622 --> 11:47:47,602 So this might be a better\nway to create a linked list 12162 11:47:47,601 --> 11:47:51,601 even before you've inserted any\nnumbers into the thing itself. 12163 11:47:52,101 --> 11:47:55,316 So after that, how can we go about\n 12164 11:47:55,317 --> 11:47:56,692 So now the story looks like this. 12165 11:47:56,692 --> 11:47:59,632 Oscar is gone because inside\nof this box is all zero bits. 12166 11:47:59,631 --> 11:48:03,531 Just because it's nice and clean, and\n 12167 11:48:03,531 --> 11:48:08,072 Well, if I want to add the number 1\n 12168 11:48:08,072 --> 11:48:10,072 Well, perhaps I could\nstart with code like this. 12169 11:48:10,072 --> 11:48:11,781 Borrowing inspiration from last week. 12170 11:48:11,781 --> 11:48:16,402 Let's ask malloc for enough\nspace for the size of a node. 12171 11:48:16,402 --> 11:48:20,542 And this gets to your question earlier,\n 12172 11:48:20,542 --> 11:48:23,842 I don't just need space for an int and\n 12173 11:48:24,921 --> 11:48:27,631 And I gave that thing a name, node. 12174 11:48:27,631 --> 11:48:30,411 So size of node figures out\nand does the arithmetic for me. 
12175 11:48:30,411 --> 11:48:32,871 And gives me back the\nright number of bytes. 12176 11:48:32,872 --> 11:48:36,412 This, then, stores the address\nof that chunk of memory 12177 11:48:36,411 --> 11:48:38,361 in what I'll temporarily called n. 12178 11:48:38,362 --> 11:48:40,641 Just to represent a generic new node. 12179 11:48:42,351 --> 11:48:45,561 Because just like last week when I\n 12180 11:48:45,561 --> 11:48:47,841 and I stored it in an int* pointer. 12181 11:48:47,841 --> 11:48:50,241 This week, if I'm asking\nfor memory for a node 12182 11:48:50,241 --> 11:48:52,822 I'm storing it in a node* pointer. 12183 11:48:52,822 --> 11:48:56,002 So technically, nothing new\nthere except for this new term 12184 11:48:56,002 --> 11:48:58,502 of art in data structure called node. 12185 11:48:59,002 --> 11:49:00,351 So what does that do for me? 12186 11:49:00,351 --> 11:49:03,141 It essentially draws a\npicture like this in memory. 12187 11:49:03,141 --> 11:49:07,171 I still have my list variable from\n 12188 11:49:07,671 --> 11:49:09,129 And that's why I've drawn it blank. 12189 11:49:09,129 --> 11:49:11,541 I also now have a\ntemporary variable called 12190 11:49:11,542 --> 11:49:15,052 n, which I initialize to\nthe return value of malloc. 12191 11:49:15,052 --> 11:49:17,132 Which gave me one of\nthese nodes in memory. 12192 11:49:17,131 --> 11:49:19,612 But I've drawn it having\ngarbage values, too 12193 11:49:19,612 --> 11:49:21,332 because I don't know what int is there. 12194 11:49:21,332 --> 11:49:22,790 I don't know what pointer is there. 12195 11:49:22,790 --> 11:49:27,082 It's garbage values because malloc does\n 12196 11:49:27,082 --> 11:49:28,732 There is another function for that. 12197 11:49:28,732 --> 11:49:31,582 But malloc alone just says,\nsure, use this chunk of memory. 12198 11:49:31,582 --> 11:49:33,391 Deal with whatever is there. 12199 11:49:33,391 --> 11:49:36,381 So how can I go about\ninitializing this to known values? 
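A minimal sketch of the two steps so far: an empty list represented as a NULL pointer, and a single node requested from malloc. The helper names is_empty and allocate_node are assumptions for illustration.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// An empty list is just a pointer that holds NULL instead of a garbage value.
int is_empty(const node *list)
{
    return list == NULL;
}

// Ask malloc for exactly enough bytes for one node. The result may be NULL
// if no memory is available. On success, the number and next fields hold
// garbage values until the caller assigns them, so they must not be read yet.
node *allocate_node(void)
{
    node *n = malloc(sizeof(node));
    return n;
}
```

A caller would start with node *list = NULL; and then node *n = allocate_node();, checking n against NULL before touching it and eventually passing it to free. The "other function" that hands back zero-filled memory is presumably calloc, which takes a count and an element size, as in calloc(1, sizeof(node)).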
12200 11:49:36,381 --> 11:49:40,921 Well, suppose I want to insert the\n 12201 11:49:40,921 --> 11:49:44,693 A list of size 1, I could\ndo something like this. 12202 11:49:44,694 --> 11:49:47,402 And this is where you have to\n 12203 11:49:47,402 --> 11:49:51,542 My conditional here is asking the\n 12204 11:49:51,542 --> 11:49:54,692 So that is, if malloc\ngave me valid memory 12205 11:49:54,692 --> 11:49:58,172 and I don't have to quit altogether\n 12206 11:49:58,171 --> 11:50:02,072 If n does not equal null, but\nis equal to a valid address 12207 11:50:02,072 --> 11:50:03,552 I'm going to go ahead and do this. 12208 11:50:03,552 --> 11:50:06,302 And this is cryptic-looking syntax now. 12209 11:50:06,302 --> 11:50:09,632 But does someone want to take a stab\n 12210 11:50:13,862 --> 11:50:18,002 How might you explain what that\ninner line of code is doing? *n. 12211 11:50:27,283 --> 11:50:29,641 The place that n is pointing\nto, set it equal to 1. 12212 11:50:29,641 --> 11:50:33,542 Or using the vernacular of going\nthere, go to the address in n 12213 11:50:33,542 --> 11:50:35,961 and set its number field to 1. 12214 11:50:35,961 --> 11:50:37,961 However you want to think\nabout it, that's fine. 12215 11:50:37,961 --> 11:50:40,411 But the * again is the\ndereference operator here. 12216 11:50:40,411 --> 11:50:42,211 And we're doing the\nparentheses, which we 12217 11:50:42,211 --> 11:50:45,722 haven't needed to do before because we\n 12218 11:50:45,722 --> 11:50:47,491 structures together until today. 12219 11:50:47,491 --> 11:50:49,862 This just means go there first. 12220 11:50:49,862 --> 11:50:52,201 And then once you're\nthere, go access number. 12221 11:50:52,201 --> 11:50:54,311 You don't want to do one\nthing before the other. 12222 11:50:54,311 --> 11:50:56,371 So this is just enforcing\norder of operations. 12223 11:50:56,372 --> 11:50:58,782 The parentheses just like\nin grade school math. 
12224 11:50:59,281 --> 11:51:00,692 So this line of code is cryptic. 12225 11:51:01,463 --> 11:51:03,421 It's not something most\npeople easily remember. 12226 11:51:03,421 --> 11:51:07,231 Thankfully, there's that syntactic\n 12227 11:51:08,338 --> 11:51:10,171 And this, even though\nit's new to you today 12228 11:51:10,171 --> 11:51:12,301 should eventually feel\na little more familiar. 12229 11:51:12,302 --> 11:51:15,692 Because this now is shorthand\nnotation for saying, start at n. 12230 11:51:15,692 --> 11:51:17,891 Go there as by following the arrow. 12231 11:51:17,891 --> 11:51:20,012 And when you get there,\nchange the number field. 12232 11:51:22,201 --> 11:51:24,721 So most people would not\nwrite code like this. 12233 11:51:25,512 --> 11:51:26,912 It's a couple extra keystrokes. 12234 11:51:26,911 --> 11:51:30,781 This just looks more like the artist's\n 12235 11:51:30,781 --> 11:51:35,012 And how most CS people would think about\n 12236 11:51:37,775 --> 11:51:42,132 The picture now, after setting number to\n 12237 11:51:42,131 --> 11:51:43,921 So there's still one step missing. 12238 11:51:43,921 --> 11:51:46,201 And that's, of course, to\ninitialize, it would seem 12239 11:51:46,201 --> 11:51:50,561 the pointer in this new node\nto something known like null. 12240 11:51:50,561 --> 11:51:52,216 So I bet we could do this like this. 12241 11:51:52,216 --> 11:51:54,091 With a different line\nof code, I'm just going 12242 11:51:54,091 --> 11:52:00,361 to say if n does not equal null,\n 12243 11:52:00,362 --> 11:52:04,022 Or more pedantically, go\nto n, follow the arrow 12244 11:52:04,021 --> 11:52:07,921 and then update the next field\n 12245 11:52:07,921 --> 11:52:10,171 And again, this is just\ndoing some nice bookkeeping. 
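[Editor's sketch] The two spellings described above, (*n).number and n->number, can be written out as follows. This is a minimal sketch, assuming the node type from the slides (an int called number and a pointer called next). The function name new_node_stepwise is hypothetical, and the three separate conditionals mirror the step-by-step slides rather than how the code would normally be written.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper (not from the lecture): allocate one node and
// initialize its fields one step at a time, as on the slides.
// Returns NULL if malloc could not provide memory.
node *new_node_stepwise(void)
{
    // malloc does not initialize memory, so both fields hold garbage here
    node *n = malloc(sizeof(node));

    // Longhand: the parentheses force "go to n" to happen before ".number"
    if (n != NULL)
    {
        (*n).number = 1;
    }

    // Shorthand for exactly the same thing: follow the arrow, set number
    if (n != NULL)
    {
        n->number = 1;
    }

    // Bookkeeping: replace the garbage pointer with a known value
    if (n != NULL)
    {
        n->next = NULL;
    }

    return n;
}
```

Without the parentheses, *n.number would be parsed as *(n.number), which does not compile when n is a pointer; that is the order-of-operations point made above.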
12246 11:52:10,171 --> 11:52:13,351 Technically speaking,\nwe might not need to set 12247 11:52:13,351 --> 11:52:16,391 this to null if we're going to keep\n 12248 11:52:16,391 --> 11:52:19,591 But I'm doing it step-by-step so\n 12249 11:52:19,591 --> 11:52:23,281 And there's no bugs in\nmy code at this point. 12250 11:52:24,752 --> 11:52:27,211 There's one last thing I'm\ngoing to have to do here. 12251 11:52:27,211 --> 11:52:32,432 If the goal, ultimately, was to insert\n 12252 11:52:32,432 --> 11:52:36,342 what's the last step I\nshould, perhaps, do here? 12253 11:52:38,031 --> 11:52:40,741 AUDIENCE: Set the pointer value to null. 12254 11:52:41,491 --> 11:52:45,451 I now need to update the actual\n 12255 11:52:45,451 --> 11:52:48,511 list, to point at this brand new node. 12256 11:52:48,512 --> 11:52:52,798 That is now perfectly initialized as\n 12257 11:52:52,798 --> 11:52:54,881 Yeah, technically, this\nis already pointing there. 12258 11:52:54,881 --> 11:52:57,572 But I describe this deliberately\nearlier as being temporary. 12259 11:52:57,572 --> 11:53:02,101 I just needed this to get it back from\n 12260 11:53:02,101 --> 11:53:04,711 This is the long term\nvariable I care about. 12261 11:53:04,711 --> 11:53:06,961 So I'm going to want to do\nsomething simple like this. 12262 11:53:09,002 --> 11:53:11,345 And this seems a little\nweird that list equals n. 12263 11:53:11,345 --> 11:53:13,262 But again, think about\nwhat's inside this box. 12264 11:53:13,262 --> 11:53:15,470 At the moment this is null\nbecause there is no linked 12265 11:53:15,470 --> 11:53:17,012 list at the beginning of our story. 12266 11:53:17,012 --> 11:53:21,391 N is the address of the beginning, and\n 12267 11:53:21,391 --> 11:53:24,781 So it stands to reason that\nif you set list equal to n 12268 11:53:24,781 --> 11:53:27,661 that has the effect of\ncopying this address up here. 
12269 11:53:27,661 --> 11:53:30,764 Or really just copying the\narrow into that same location 12270 11:53:30,764 --> 11:53:32,432 so that now the picture looks like this. 12271 11:53:32,432 --> 11:53:35,822 And heck, if this was a temporary\n 12272 11:53:35,822 --> 11:53:37,351 And now, this is the picture. 12273 11:53:37,351 --> 11:53:39,512 So an annoying number\nof steps, certainly 12274 11:53:39,512 --> 11:53:42,002 to walk through verbally like this. 12275 11:53:42,002 --> 11:53:44,162 But it's just malloc to\ngive yourself a node 12276 11:53:44,161 --> 11:53:49,411 initialize the 2 fields inside of\n 12277 11:53:50,252 --> 11:53:52,391 I didn't have to copy anything. 12278 11:53:52,391 --> 11:53:55,614 I just had to insert\nsomething in this case. 12279 11:53:55,614 --> 11:53:58,322 Let me pause here to see if there's\n 12280 11:53:58,322 --> 11:54:02,271 And we'll see before long it all\n 12281 11:54:02,271 --> 11:54:06,447 AUDIENCE: So if the\nstatements [INAUDIBLE].. 12282 11:54:07,072 --> 11:54:10,491 I drew them separately just\nfor the sake of the voiceover 12283 11:54:10,491 --> 11:54:12,502 of doing each thing very methodically. 12284 11:54:12,502 --> 11:54:14,572 In real code, as we'll\ntransition to now 12285 11:54:14,572 --> 11:54:16,701 I could have and should\nhave just done it 12286 11:54:16,701 --> 11:54:20,481 all inside of one conditional after\n 12287 11:54:20,482 --> 11:54:22,792 I could set number to a value like 1. 12288 11:54:22,792 --> 11:54:25,897 And I could set the pointer\nitself to something like null. 12289 11:54:26,512 --> 11:54:30,082 Well let's translate, then,\nthis into some similar code 12290 11:54:30,082 --> 11:54:34,822 that allows us to build up a linked\n 12291 11:54:35,631 --> 11:54:37,381 But now, using this new primitive. 12292 11:54:37,381 --> 11:54:39,621 So I'm going to go\nback into VS Code here. 
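[Editor's sketch] The steps just summarized (malloc a node, initialize its two fields, then point list at it) can be folded into one conditional, as the answer above suggests. This is a minimal sketch under the same assumptions about the node type; the function name list_of_one is hypothetical and is used only so the steps can be shown outside of main.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper (not from the lecture): build a linked list of
// size 1 holding the given number. Returns NULL if malloc fails.
node *list_of_one(int number)
{
    // A list of size 0: initialized to NULL rather than a garbage value
    node *list = NULL;

    // n is only a temporary variable for the brand new node
    node *n = malloc(sizeof(node));
    if (n != NULL)
    {
        n->number = number; // initialize the number field
        n->next = NULL;     // initialize the next field
        list = n;           // copy the node's address into list
    }

    return list;
}
```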
12293 11:54:39,622 --> 11:54:42,952 I'm going to go ahead now and delete\n 12294 11:54:44,752 --> 11:54:49,951 And now, inside of my main function,\n 12295 11:54:49,951 --> 11:54:53,661 I'm going to first give\nmyself a list of size 0. 12296 11:54:53,661 --> 11:54:56,091 And I'm going to call that node* list. 12297 11:54:56,091 --> 11:54:59,091 And I'm going to initialize that\n 12298 11:54:59,091 --> 11:55:02,241 But I'm also now going to have to\n 12299 11:55:03,451 --> 11:55:06,981 So recall that I might do something\nlike typedef, struct node. 12300 11:55:06,982 --> 11:55:09,802 Inside of this struct node, I'm\ngoing to have a number, which 12301 11:55:09,802 --> 11:55:11,492 I'll call number of type int. 12302 11:55:11,491 --> 11:55:13,641 And I'm going to have\na structure called node 12303 11:55:13,641 --> 11:55:16,951 with a * that says the next\npointer is called next. 12304 11:55:16,951 --> 11:55:20,631 And I'm going to call this whole\nthing, more succinctly, node 12305 11:55:22,311 --> 11:55:25,401 Now as an aside, for those of you\n 12306 11:55:27,082 --> 11:55:29,932 Technically, I could\ndo something like this. 12307 11:55:29,932 --> 11:55:33,442 Not use typedef and not\nuse the word node alone. 12308 11:55:33,442 --> 11:55:37,162 This syntax here would actually\ncreate for me a new data 12309 11:55:37,161 --> 11:55:40,311 type called, verbosely, struct node. 12310 11:55:40,311 --> 11:55:42,921 And I could use this throughout\nmy code saying struct node. 12311 11:55:43,942 --> 11:55:45,322 That just gets a little tedious. 12312 11:55:45,322 --> 11:55:48,197 And it would be nicer just to refer\n 12313 11:55:49,232 --> 11:55:51,712 So what typedef has\nbeen doing for us is it 12314 11:55:51,711 --> 11:55:55,252 again, lets us invent our own\nword that's even more succinct. 
12315 11:55:55,252 --> 11:55:58,521 And this just has the effect\nnow of calling this whole thing 12316 11:55:58,521 --> 11:56:02,241 node without the need, subsequently, to\n 12317 11:56:04,161 --> 11:56:07,531 So now that this thing exists in\n 12318 11:56:10,252 --> 11:56:12,921 And to do this, I'm going to\ngive myself a temporary variable. 12319 11:56:12,921 --> 11:56:14,822 I'll call it n for consistency. 12320 11:56:14,822 --> 11:56:18,021 I'm going to use malloc to\ngive myself the size of a node 12321 11:56:19,561 --> 11:56:21,021 And then, I'm going to\ndo a little safety check. 12322 11:56:21,021 --> 11:56:23,951 If n equals equals null, I'm going\n 12323 11:56:23,951 --> 11:56:25,701 I'm just going to quit\nout of this program 12324 11:56:25,701 --> 11:56:28,441 because there's nothing useful\nto be done at this point. 12325 11:56:28,442 --> 11:56:31,052 But most likely my computer is\nnot going to run out of memory. 12326 11:56:31,052 --> 11:56:34,232 So I'm going to assume we can keep\n 12327 11:56:34,232 --> 11:56:38,872 If n does not equal null, and that\n 12328 11:56:40,851 --> 11:56:42,411 I'm going to build this up backwards. 12329 11:56:44,188 --> 11:56:45,771 That's OK, let's go ahead and do this. 12330 11:56:48,082 --> 11:56:52,972 And then n [arrow next] equals null. 12331 11:56:52,972 --> 11:56:59,902 And now, update list to point\nto new node, list equals n. 12332 11:56:59,902 --> 11:57:02,062 So at this point in the\nstory, we've essentially 12333 11:57:02,061 --> 11:57:06,811 constructed what was that first\npicture, which looks like this. 12334 11:57:06,811 --> 11:57:11,362 This is the corresponding code via\n 12335 11:57:11,362 --> 11:57:14,342 Suppose now, we want to add\nthe number 2 to the list. 12336 11:57:21,391 --> 11:57:23,811 Well, I don't need to\nredeclare n because I can use 12337 11:57:23,811 --> 11:57:25,591 the same temporary variables before. 
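[Editor's sketch] The typedef aside from a moment ago can be illustrated by writing the same structure both ways. This is only a sketch: the name plain_node is hypothetical and exists so that both versions can sit in one file without clashing.

```c
#include <stddef.h>

// Without typedef: the type's name is the two words "struct plain_node",
// and that is what every later declaration has to say.
struct plain_node
{
    int number;
    struct plain_node *next;
};

// With typedef: the same shape, but the whole thing can be called "node".
// Inside the braces the short name does not exist yet, which is why the
// next field is still declared as struct node *.
typedef struct node
{
    int number;
    struct node *next;
} node;
```

With the first form a list would be declared as struct plain_node *list = NULL; with the second, node *list = NULL; as in the lecture's code.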
12338 11:57:25,591 --> 11:57:30,791 So this time, I'm just going to say n\n 12339 11:57:30,792 --> 11:57:32,542 I'm, again, going to\nhave my safety check. 12340 11:57:32,542 --> 11:57:36,772 So if n equals equals null, then let's\n 12341 11:57:36,771 --> 11:57:41,301 But, I have to be a\nlittle more careful now. 12342 11:57:41,302 --> 11:57:43,641 Technically speaking,\nwhat do I still need 12343 11:57:43,641 --> 11:57:48,021 to do before I quit out of my\nprogram to be really proper? 12344 11:57:48,021 --> 11:57:51,361 Free the memory that did\nsucceed a little higher up. 12345 11:57:51,362 --> 11:57:56,762 So I think it suffices to free what\n 12346 11:57:57,262 --> 11:58:03,742 Now, if all was well, though, let's\n 12347 11:58:03,741 --> 11:58:09,322 And now, n [arrow next] equals null. 12348 11:58:09,322 --> 11:58:12,381 And now, let's go ahead\nand add it to the list. 12349 11:58:12,381 --> 11:58:20,391 If I go ahead and do\nlist arrow next equals n 12350 11:58:20,391 --> 11:58:24,141 I think what we've just done is\nbuild up the equivalent, now 12351 11:58:24,141 --> 11:58:27,141 of this in the computer's memory. 12352 11:58:27,141 --> 11:58:29,661 By going to the list\nfield's next field, which 12353 11:58:29,661 --> 11:58:33,561 is synonymous with the 1\nnodes, bottom-most box. 12354 11:58:33,561 --> 11:58:37,021 And store the address of what was n,\n 12355 11:58:37,021 --> 11:58:39,871 And I'm just throwing away, in the\n 12356 11:58:42,362 --> 11:58:47,569 Let me go down here and say, add\n 12357 11:58:49,822 --> 11:58:52,762 And clearly, in a real program, we\n 12358 11:58:52,762 --> 11:58:56,542 And do this dynamically or a function\n 12359 11:58:56,542 --> 11:58:59,601 But just to go through the\nsyntax here, this is fine. 12360 11:58:59,601 --> 11:59:03,182 If n equals equals null, out\nof memory for some reason. 
12361 11:59:03,182 --> 11:59:09,131 Let's return 1, but we\nshould free the list itself 12362 11:59:09,131 --> 11:59:12,932 and even the second node, list [next]. 12363 11:59:12,932 --> 11:59:16,211 But I've deliberately done this poorly. 12364 11:59:16,711 --> 11:59:18,722 This is a little more subtle now. 12365 11:59:18,722 --> 11:59:22,052 And let me get rid of the highlighting\n 12366 11:59:22,052 --> 11:59:26,372 If n happens to equal equal\nnull, and something really just 12367 11:59:26,372 --> 11:59:32,522 went wrong there, out of memory,\n 12368 11:59:32,521 --> 11:59:35,252 And again, it's not that I'm\nfreeing those variables per se. 12369 11:59:35,252 --> 11:59:39,101 I'm freeing the addresses\nin those variables. 12370 11:59:39,101 --> 11:59:41,371 But there's also a\nbug with my code here. 12371 11:59:45,061 --> 11:59:49,165 This line here, 43, what is\nthat freeing specifically? 12372 11:59:49,832 --> 11:59:52,382 AUDIENCE: You're freeing list 2 times. 12373 11:59:52,381 --> 11:59:54,121 SPEAKER 1: I'm freeing, not so. 12374 11:59:54,631 --> 11:59:56,221 I'm not freeing list 2 times. 12375 11:59:56,222 --> 11:59:59,012 Technically, I'm freeing\nlist once and list next once. 12376 11:59:59,012 --> 12:00:01,082 But let me just ask the\nmore explicit question. 12377 12:00:01,082 --> 12:00:03,902 What am I freeing with\nline 43 at the moment? 12378 12:00:08,911 --> 12:00:10,921 Because if 1 is at the\nbeginning of the list 12379 12:00:10,921 --> 12:00:14,011 list contains the address\nof that number 1 node. 12380 12:00:14,012 --> 12:00:15,762 And so this frees that node. 12381 12:00:15,762 --> 12:00:18,732 This line of code, you might\nthink now intuitively, OK 12382 12:00:18,732 --> 12:00:21,092 it's probably freeing the node number 2. 12383 12:00:22,891 --> 12:00:24,601 Valgrind might help you catch this. 12384 12:00:24,601 --> 12:00:27,002 But by eyeing it, it's\nnot necessarily obvious. 
12385 12:00:27,002 --> 12:00:31,472 You should never touch memory\nthat you have already freed. 12386 12:00:31,472 --> 12:00:34,412 And so, the fact that I did\nin this order, very bad. 12387 12:00:34,411 --> 12:00:37,111 Because I'm telling the\noperating system, I don't know. 12388 12:00:37,112 --> 12:00:39,631 I don't need the list address anymore. 12389 12:00:40,891 --> 12:00:43,141 And then, literally one line later,\n 12390 12:00:43,141 --> 12:00:45,211 Let me actually go to\nthat address for a moment 12391 12:00:45,211 --> 12:00:47,881 and look at the next\nfield of that first node. 12392 12:00:48,701 --> 12:00:51,191 You've already given up\ncontrol over the node. 12393 12:00:51,192 --> 12:00:54,211 So it's an easy fix in\nthis case, logically. 12394 12:00:54,211 --> 12:00:56,851 But we should be freeing\nthe second node first 12395 12:00:56,851 --> 12:01:00,542 and then the first one\nso that we're doing it 12396 12:01:00,542 --> 12:01:02,522 in, essentially, reverse order. 12397 12:01:02,521 --> 12:01:04,438 And again, Valgrind would\nhelp you catch that. 12398 12:01:04,438 --> 12:01:07,063 But that's the kind of thing one\nneeds to be careful about when 12399 12:01:08,082 --> 12:01:10,592 You cannot touch memory\nafter you freed it. 12400 12:01:12,451 --> 12:01:17,971 Let me go ahead and update\nthe number field of n to be 3. 12401 12:01:17,972 --> 12:01:20,982 The next node of n to be null. 12402 12:01:20,982 --> 12:01:22,772 And then, just like\nin the slide earlier 12403 12:01:22,771 --> 12:01:28,502 I think I can do list\nnext, next equals n. 12404 12:01:28,502 --> 12:01:32,372 And that has the effect now of\n 12405 12:01:32,372 --> 12:01:34,472 essentially, this data structure. 12406 12:01:36,302 --> 12:01:38,342 Like, in a better world, we'd\nhave a loop and some functions 12407 12:01:38,341 --> 12:01:39,901 that are automating this process. 
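[Editor's sketch] The manual three-node build, with the cleanup done in the corrected order, might look like this. It is a sketch rather than the lecture's exact file: the steps are wrapped in a hypothetical function build_three that returns NULL on failure, where the lecture's main returns 1.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper (not from the lecture): build the list 1 -> 2 -> 3
// by hand. Returns the first node, or NULL if any malloc fails.
node *build_three(void)
{
    node *list = NULL;

    // Add the number 1
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = 1;
    n->next = NULL;
    list = n;

    // Add the number 2, reusing the same temporary variable
    n = malloc(sizeof(node));
    if (n == NULL)
    {
        free(list); // give back the node that did succeed
        return NULL;
    }
    n->number = 2;
    n->next = NULL;
    list->next = n;

    // Add the number 3
    n = malloc(sizeof(node));
    if (n == NULL)
    {
        // Free the second node first: once list has been freed,
        // list->next must not be touched
        free(list->next);
        free(list);
        return NULL;
    }
    n->number = 3;
    n->next = NULL;
    list->next->next = n;

    return list;
}
```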
12408 12:01:39,902 --> 12:01:44,162 But, for now, we're doing it just\n 12409 12:01:44,161 --> 12:01:48,901 So at this point, unfortunately,\n 12410 12:01:48,902 --> 12:01:53,671 It's no longer as easy as int\ni equals 0, i less than 3, i++. 12411 12:01:53,671 --> 12:02:00,902 Because you cannot just\ndo something like this. 12412 12:02:00,902 --> 12:02:06,002 Because pointer arithmetic\nno longer comes into play 12413 12:02:06,002 --> 12:02:10,232 when it's you, who are stitching\n 12414 12:02:10,232 --> 12:02:12,932 In all of our past examples\nwith arrays, you've 12415 12:02:12,932 --> 12:02:16,302 been trusting that all of the bytes in\n 12416 12:02:16,302 --> 12:02:19,015 So it's perfectly reasonable for\nthe compiler and the computer 12417 12:02:19,014 --> 12:02:21,932 to just figure out, oh, well if you\n 12418 12:02:21,932 --> 12:02:23,612 [1], it's one location over. 12419 12:02:23,612 --> 12:02:25,592 [2], it's one location over. 12420 12:02:25,591 --> 12:02:28,511 This is way less obvious now. 12421 12:02:28,512 --> 12:02:32,132 Because even though you might want to\n 12422 12:02:32,131 --> 12:02:36,752 list, or the second, or the third, you\n 12423 12:02:38,072 --> 12:02:41,521 Instead, you have to\nfollow all of those arrows. 12424 12:02:41,521 --> 12:02:44,822 So with linked lists, you can't use\n 12425 12:02:44,822 --> 12:02:47,792 because one node might be here,\nover here, over here, over here. 12426 12:02:47,792 --> 12:02:51,031 You can't just use some simple offset. 12427 12:02:51,031 --> 12:02:53,822 So I think our code is going\nto have to be a little fancier. 12428 12:02:53,822 --> 12:02:57,302 And this might look scary at\nfirst, but it's just an application 12429 12:02:57,302 --> 12:02:59,641 of some of the basic definitions here. 
12430 12:02:59,641 --> 12:03:06,961 Let me do a for-loop that actually\n 12431 12:03:08,612 --> 12:03:13,262 I'm going to keep doing this, so\n 12432 12:03:13,262 --> 12:03:15,842 And on each iteration\nof this loop, I'm going 12433 12:03:15,841 --> 12:03:20,581 to update TMP to be\nwhatever TMP arrow next is. 12434 12:03:20,582 --> 12:03:23,192 And I'll remind you in a moment\nand explain in more detail. 12435 12:03:23,192 --> 12:03:27,211 But when I print something here\nwith printf, I can still use %i. 12436 12:03:27,211 --> 12:03:29,521 Because it's still a number\nat the end of the day. 12437 12:03:29,521 --> 12:03:34,121 But what I want to print out is the\n 12438 12:03:34,122 --> 12:03:36,514 So maybe the ugliest\nfor-loop we've ever seen. 12439 12:03:36,514 --> 12:03:38,972 Because it's mixing, not just\nthe idea of a for-loop, which 12440 12:03:38,972 --> 12:03:40,982 itself was a bit cryptic weeks ago. 12441 12:03:40,982 --> 12:03:43,507 But now, I'm using pointers\ninstead of integers. 12442 12:03:43,506 --> 12:03:45,631 But I'm not violating the\ndefinition of a for-loop. 12443 12:03:45,631 --> 12:03:48,421 Recall that a for-loop has 3\nmain things in parentheses. 12444 12:03:48,421 --> 12:03:50,281 What do you want to initialize first? 12445 12:03:50,281 --> 12:03:53,222 What condition do you want to\nkeep checking again and again? 12446 12:03:53,222 --> 12:03:56,921 And what update do you want to make\n 12447 12:03:56,921 --> 12:03:59,341 So with that basic\ndefinition in mind, this 12448 12:03:59,341 --> 12:04:01,831 is giving me a temporary\nvariable called TMP 12449 12:04:01,832 --> 12:04:04,002 that is initialized to\nthe beginning of the list. 12450 12:04:04,002 --> 12:04:07,591 So it's like pointing my\nfinger at the number 1 node. 12451 12:04:07,591 --> 12:04:11,011 Then, I'm asking the question,\ndoes TMP not equal null? 
12452 12:04:11,012 --> 12:04:13,652 Well, hopefully, not because\nI'm pointing at a valid node 12453 12:04:15,192 --> 12:04:17,012 So, of course, it\ndoesn't equal null yet. 12454 12:04:17,012 --> 12:04:19,512 Null won't be until we get\nto the end of the list. 12455 12:04:21,012 --> 12:04:22,742 I start at this TMP variable. 12456 12:04:22,741 --> 12:04:27,752 I follow the arrow and go to\nthe number field therein. 12457 12:04:28,832 --> 12:04:32,492 The for-loop says,\nchange TMP to be whatever 12458 12:04:32,491 --> 12:04:36,572 is at TMP, by following the arrow\nand grabbing the next field. 12459 12:04:36,572 --> 12:04:39,741 That, then, has the result of being\n 12460 12:04:39,741 --> 12:04:42,241 No, of course, it doesn't equal\nnull because the second node 12461 12:04:43,531 --> 12:04:45,402 Null is still at the very end. 12462 12:04:45,402 --> 12:04:47,192 So I print out the number 2. 12463 12:04:47,192 --> 12:04:51,152 Next step, I update TMP one more\ntime to be whatever is next. 12464 12:04:51,152 --> 12:04:53,711 That, then, does not yet equal null. 12465 12:04:53,711 --> 12:04:55,951 So I go ahead and print\nout the number 3 node. 12466 12:04:55,951 --> 12:05:01,601 Then one last time, I update TMP to\n 12467 12:05:01,601 --> 12:05:05,461 But after 1, 2, 3, that\nlast next field is null. 12468 12:05:05,461 --> 12:05:09,271 And so, I break out of\nthis for-loop altogether. 12469 12:05:09,271 --> 12:05:12,211 So if I do this in\npictorial form, all we're 12470 12:05:12,211 --> 12:05:15,781 doing, if I now use my finger\nto represent the TMP variable. 12471 12:05:15,781 --> 12:05:19,561 I initialize TMP to be whatever\nlist is, so it points here. 12472 12:05:19,561 --> 12:05:22,261 That's obviously not null\nso I print out whatever 12473 12:05:22,262 --> 12:05:26,582 is at TMP, follow the arrow\nto number, and I print that out. 12474 12:05:26,582 --> 12:05:28,772 Then I update TMP to point here. 
12475 12:05:28,771 --> 12:05:30,558 Then I update TMP to point here. 12476 12:05:30,559 --> 12:05:31,891 Then I update TMP to point here. 12477 12:05:34,961 --> 12:05:39,152 So, again, admittedly much more cryptic\n 12478 12:05:40,091 --> 12:05:46,336 But it's just a different\nutilization of the for-loop syntax. 12479 12:05:46,836 --> 12:05:50,622 AUDIENCE: How does it happen that\n 12480 12:05:50,622 --> 12:05:52,500 Because it seems to me that addresses- 12481 12:05:53,542 --> 12:05:56,542 How is it that I'm actually printing\n 12482 12:05:57,921 --> 12:05:59,601 The compiler is helping me here. 12483 12:05:59,601 --> 12:06:02,211 Because I taught it, in the\nvery beginning of my program 12484 12:06:05,211 --> 12:06:08,991 The compiler knows that a node has\n 12485 12:06:10,911 --> 12:06:16,891 Because I'm iterating using a node*\n 12486 12:06:16,891 --> 12:06:19,641 the compiler knows that any\ntime I'm pointing at something 12487 12:06:19,641 --> 12:06:21,421 I'm pointing at the whole node. 12488 12:06:21,421 --> 12:06:24,502 Doesn't matter where specifically in\n 12489 12:06:24,502 --> 12:06:26,692 It's, ultimately, pointing\nat the whole node itself. 12490 12:06:26,692 --> 12:06:30,802 And the fact that I, then, use\nTMP arrow number means, OK 12491 12:06:30,802 --> 12:06:31,972 adjust your finger slightly. 12492 12:06:31,972 --> 12:06:35,991 So you're literally pointing at the\n 12493 12:06:35,991 --> 12:06:40,402 So that's sufficient information for\n 12494 12:06:41,042 --> 12:06:44,211 Other questions then\non this approach here. 12495 12:06:46,762 --> 12:06:51,322 SPEAKER 1: How would I use a for-loop\n 12496 12:06:51,322 --> 12:06:56,122 You will do something like this,\nif I may, in problem set 5. 12497 12:06:56,122 --> 12:06:59,211 We will give you some of the\nscaffolding for doing this. 12498 12:06:59,211 --> 12:07:02,182 But in this coming week's materials,\nwe will guide you to that. 
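[Editor's sketch] The traversal walked through above (initialize tmp to list, keep going while tmp is not null, update tmp to tmp->next) can be sketched as follows. The function name print_list and the returned count are hypothetical additions; the lecture writes the loop directly in main and only prints.

```c
#include <stdio.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper (not from the lecture): print every number in the
// list, one per line, and return how many nodes were visited.
int print_list(const node *list)
{
    int count = 0;

    // Initialize: point tmp at the first node.
    // Condition: keep going until tmp falls off the end (NULL).
    // Update:    follow the arrow to the next node.
    for (const node *tmp = list; tmp != NULL; tmp = tmp->next)
    {
        printf("%i\n", tmp->number);
        count++;
    }

    return count;
}
```

Because tmp has type node *, the compiler knows where the number and next fields sit inside whatever tmp points at, which is what lets tmp->number reach the int rather than the raw address.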
12499 12:07:02,182 --> 12:07:04,775 But let me not spoil it just yet. 12500 12:07:06,192 --> 12:07:08,559 AUDIENCE: So I had a\nquestion about line 49. 12501 12:07:09,141 --> 12:07:11,159 AUDIENCE: Is line 49\npossible in line 43? 12502 12:07:12,201 --> 12:07:15,381 Is line 49 acceptable, even\nif we freed it earlier. 12503 12:07:15,381 --> 12:07:18,081 We didn't free it in line\n43, in this case, right. 12504 12:07:18,082 --> 12:07:22,281 You can only reach line 49,\nif n does not equal null. 12505 12:07:22,281 --> 12:07:24,472 And you do not return on line 45. 12506 12:07:25,341 --> 12:07:29,661 I was only doing those freeing, if I\n 12507 12:07:32,512 --> 12:07:33,887 AUDIENCE: I had a quick question. 12508 12:07:36,862 --> 12:07:40,131 SPEAKER 1: Correct You're asking\n 12509 12:07:40,131 --> 12:07:41,839 does that mean you\ndon't have to free it? 12510 12:07:41,839 --> 12:07:44,241 You never have to free pointers, per se. 12511 12:07:44,241 --> 12:07:49,042 You should only free addresses that\n 12512 12:07:49,042 --> 12:07:51,412 So I haven't finished\nthe program, to be fair. 12513 12:07:51,411 --> 12:07:53,361 But you're not freeing variables. 12514 12:07:53,362 --> 12:07:55,222 You're not freeing like, fields. 12515 12:07:55,222 --> 12:07:58,351 You are freeing specific\naddresses, whatever they may be. 12516 12:07:58,351 --> 12:08:01,252 So the last thing, and I\nwas stalling on showing this 12517 12:08:01,252 --> 12:08:02,932 because it too is a little cryptic. 12518 12:08:02,932 --> 12:08:06,052 Here is how you can free,\nnow, a whole linked list. 12519 12:08:06,052 --> 12:08:08,724 In the world of arrays,\nrecall, it was so easy. 12520 12:08:09,682 --> 12:08:11,402 You return 0 and you're done. 12521 12:08:12,622 --> 12:08:14,482 Because, again, the\ncomputer doesn't know 12522 12:08:14,482 --> 12:08:17,182 what you have stitched together\nusing all of these pointers 12523 12:08:17,182 --> 12:08:18,622 all over the computer's memory. 
12524 12:08:18,622 --> 12:08:20,662 You need to follow those arrows. 12525 12:08:20,661 --> 12:08:23,401 So one way to do this\nwould be as follows. 12526 12:08:23,402 --> 12:08:28,402 While the list itself is not null,\n 12527 12:08:29,722 --> 12:08:32,454 I'm going to give myself a\ntemporary variable called TMP again. 12528 12:08:32,453 --> 12:08:34,911 And it's a different TMP because\nit's in a different scope. 12529 12:08:34,911 --> 12:08:38,691 It's inside of the while loop instead\n 12530 12:08:38,692 --> 12:08:44,122 I am going to initialize TMP to\nbe the address of the next node. 12531 12:08:44,122 --> 12:08:46,641 Just so I can get one\nstep ahead of things. 12532 12:08:47,932 --> 12:08:51,811 Because now, I can boldly\nfree the list itself 12533 12:08:51,811 --> 12:08:53,451 which does not mean the whole list. 12534 12:08:53,451 --> 12:08:56,151 Again, I'm freeing the\naddress in list, which 12535 12:08:56,152 --> 12:08:58,891 is the address of the number 1 node. 12536 12:08:59,872 --> 12:09:02,461 It's just the address\nof the number 1 node. 12537 12:09:02,461 --> 12:09:05,362 So if I first use TMP\nto point out the number 12538 12:09:05,362 --> 12:09:10,792 2 slightly in the middle of the picture,\n 12539 12:09:10,792 --> 12:09:12,772 at the moment, to free list. 12540 12:09:12,771 --> 12:09:15,351 That is the address of the first node. 12541 12:09:15,351 --> 12:09:19,641 Now I'm going to say, all right, once\n 12542 12:09:19,641 --> 12:09:24,561 I can update the list\nitself to be literally TMP. 12543 12:09:27,932 --> 12:09:33,622 If you think about this picture, TMP\n 12544 12:09:35,031 --> 12:09:38,421 So TMP, represented by my right hand\n 12545 12:09:38,421 --> 12:09:43,011 Totally safe and reasonable to\nfree now the list itself a.k.a. 12546 12:09:43,012 --> 12:09:44,632 the address of the number 1 node. 
12547 12:09:44,631 --> 12:09:47,362 That has the effect of just\nthrowing away the number 1 node 12548 12:09:47,362 --> 12:09:50,152 telling the computer you can\nreuse that memory for you. 12549 12:09:50,152 --> 12:09:53,632 The last line of code I wrote\n 12550 12:09:53,631 --> 12:09:58,042 2, at which point my loop proceeded\n 12551 12:09:58,042 --> 12:10:01,072 And only once my finger is\nliterally pointing at nowhere 12552 12:10:01,072 --> 12:10:03,832 the null symbol, will the\nloop, by nature of a while 12553 12:10:03,832 --> 12:10:06,472 loop as I'll toggle back to, break out. 12554 12:10:06,472 --> 12:10:09,112 And there's nothing more to be freed. 12555 12:10:09,112 --> 12:10:12,171 So again, what you'll see,\nultimately, in problem set 5 12556 12:10:12,171 --> 12:10:16,171 more on that later, is an opportunity\n 12557 12:10:17,211 --> 12:10:20,061 But again, even though the syntax\nis admittedly pretty cryptic 12558 12:10:20,061 --> 12:10:23,781 we're still using basics like\nthese for-loops or while loops. 12559 12:10:23,781 --> 12:10:27,442 We're just starting to now\nfollow explicit addresses rather 12560 12:10:27,442 --> 12:10:31,222 than letting the computer do\nall of the arithmetic for us 12561 12:10:31,222 --> 12:10:33,116 as we previously benefited from. 12562 12:10:33,116 --> 12:10:36,241 At the very end of this thing, I'm\n 12563 12:10:36,241 --> 12:10:39,722 And I think, then, we're good to go. 12564 12:10:40,222 --> 12:10:43,442 Questions on this linked list code now? 12565 12:10:43,442 --> 12:10:46,192 And again, we'll walk through this\n 12566 12:10:46,692 --> 12:10:51,095 AUDIENCE: Can you explain the while\n 12567 12:10:51,762 --> 12:10:55,432 Can we explain this while loop\nhere for freeing the list. 12568 12:10:55,432 --> 12:10:58,061 So notice that, first, I'm just\nasking the obvious question. 12569 12:10:58,902 --> 12:11:02,872 Because if it is, there's\nno work to be done. 
12570 12:11:02,872 --> 12:11:06,942 However, while the list is not\nnull, according to line 58 12571 12:11:08,021 --> 12:11:12,401 I want to create a temporary variable\n 12572 12:11:12,402 --> 12:11:15,022 that list arrow next is pointing at. 12573 12:11:17,741 --> 12:11:21,171 List arrow next is whatever\nthis thing is here. 12574 12:11:21,171 --> 12:11:23,951 So if my right hand represents\nthe temporary variable 12575 12:11:23,951 --> 12:11:27,951 I'm literally pointing at the\nsame thing as the list is itself. 12576 12:11:27,951 --> 12:11:31,121 The next line of code,\nrecall, was free the list. 12577 12:11:31,122 --> 12:11:33,882 And unlike, in our world of\narrays, like half an hour 12578 12:11:33,881 --> 12:11:36,581 ago where that just meant\nfree the whole darn list 12579 12:11:36,582 --> 12:11:41,171 you now have taken over control over the\n 12580 12:11:41,171 --> 12:11:43,031 in ways that you didn't with the array. 12581 12:11:43,031 --> 12:11:46,332 The computer knew how to free\nthe whole array because you 12582 12:11:46,332 --> 12:11:48,162 malloc the whole thing at once. 12583 12:11:48,161 --> 12:11:52,061 You are now mallocing the\nlinked list one node at a time. 12584 12:11:52,061 --> 12:11:54,911 And the operating system does\nnot keep track of for you 12585 12:11:56,292 --> 12:11:59,952 So when you free list,\nyou are literally freeing 12586 12:11:59,951 --> 12:12:03,911 the value of the list variable,\n 12587 12:12:03,911 --> 12:12:07,301 Then my last line of code, which I'll\n 12588 12:12:07,302 --> 12:12:11,982 list to now ignore the\nfree memory and point at 2. 12589 12:12:14,561 --> 12:12:17,981 So, again, it's just a\nvery pedantic way of using 12590 12:12:17,982 --> 12:12:21,942 this new syntax of star notation,\n 12591 12:12:21,942 --> 12:12:25,902 to do the equivalent of walking\ndown all of these arrows. 12592 12:12:25,902 --> 12:12:28,122 Following all of these breadcrumbs. 
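[Editor's sketch] The freeing loop explained above can be sketched like this. The function name free_list and the returned count are hypothetical; the lecture writes the while loop at the end of main and does not count anything.

```c
#include <stdlib.h>

typedef struct node
{
    int number;
    struct node *next;
} node;

// Hypothetical helper (not from the lecture): free every node in the
// list and return how many nodes were freed.
int free_list(node *list)
{
    int freed = 0;

    while (list != NULL)
    {
        // Get one step ahead before giving up the current node
        node *tmp = list->next;

        // Frees only the node whose address is in list, not the whole list
        free(list);
        freed++;

        // Ignore the freed memory and move on to the remembered node
        list = tmp;
    }

    return freed;
}
```

Reading list->next after free(list) would touch memory that has already been given back, which is why tmp is saved first.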
12593 12:12:28,122 --> 12:12:31,422 But it does take admittedly\nsome getting used to. 12594 12:12:31,421 --> 12:12:33,926 Syntax, you only have to do one week. 12595 12:12:33,927 --> 12:12:35,802 But, again, next week\nin Python we will begin 12596 12:12:35,802 --> 12:12:37,632 to abstract a lot of\nthis complexity away. 12597 12:12:37,631 --> 12:12:39,502 But none of this\ncomplexity is going away. 12598 12:12:39,502 --> 12:12:42,252 It's just that someone else, the\nauthors of Python for instance 12599 12:12:42,252 --> 12:12:44,389 will have automated this stuff for us. 12600 12:12:44,389 --> 12:12:46,182 The goal this week is\nto understand what it 12601 12:12:46,182 --> 12:12:49,461 is we're going to get for\nfree, so to speak, next week. 12602 12:12:49,961 --> 12:12:54,292 Questions on these linked lists? 12603 12:12:55,932 --> 12:12:58,745 AUDIENCE: So are the while\nloops strictly necessary 12604 12:12:58,745 --> 12:13:00,209 for the freeing [INAUDIBLE]. 12605 12:13:01,252 --> 12:13:03,834 Let me summarize as, could we\nhave freed this with a for-loop? 12606 12:13:04,760 --> 12:13:06,112 It just is a matter of style. 12607 12:13:06,112 --> 12:13:09,152 It's a little more elegant to do it\n 12608 12:13:09,152 --> 12:13:11,154 But other people will\nreasonably disagree. 12609 12:13:11,154 --> 12:13:13,862 Anything you can do with a while\n 12610 12:13:14,872 --> 12:13:17,211 Do-while loops, recall,\nare a little different. 12611 12:13:17,211 --> 12:13:19,853 But they will always\ndo at least one thing. 12612 12:13:19,853 --> 12:13:22,311 But for-loops and while loops\nbehave the same in this case. 12613 12:13:25,482 --> 12:13:27,881 All right, well let's just\nvary things a little bit here. 12614 12:13:27,881 --> 12:13:29,963 Just to see what some of\nthe pitfalls might now be 12615 12:13:29,963 --> 12:13:31,722 without getting into the weeds of code.
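To illustrate the answer to the audience question above, here is one way the same freeing logic could be written with a for-loop instead of a while loop. It is only a sketch: the name `free_list_for` and the returned count are hypothetical, and the node layout is assumed to be the `number` plus `next` layout used for linked lists in the lecture.

```c
#include <stdlib.h>

// One node of a singly linked list: a number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Same behavior as a while-loop version, expressed as a for-loop.
// Returns how many nodes were freed (an illustrative addition).
int free_list_for(node *list)
{
    int freed = 0;
    // tmp is declared in the initializer; the update step advances list to tmp.
    for (node *tmp = NULL; list != NULL; list = tmp)
    {
        tmp = list->next; // save the next node before this one is freed
        free(list);
        freed++;
    }
    return freed;
}
```

Both forms do the same work in the same order; which one reads better is a matter of style.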
12616 12:13:31,722 --> 12:13:35,711 Indeed, we'll try to save some of\n 12617 12:13:35,711 --> 12:13:40,002 But instead, let's imagine that we\n 12618 12:13:40,002 --> 12:13:43,182 I can offer, in exchange for a\nfew volunteers, some foam fingers 12619 12:13:43,182 --> 12:13:45,099 to bring to the next game, perhaps. 12620 12:13:45,099 --> 12:13:46,932 Could we get maybe just\none volunteer first? 12621 12:13:47,591 --> 12:13:50,591 You will be our linked\nlist from the get go. 12622 12:13:52,061 --> 12:13:54,322 SPEAKER 1: Pedro, come on up. 12623 12:13:54,322 --> 12:13:55,572 All right, thank you to Pedro. 12624 12:13:58,661 --> 12:14:00,661 And if you want to just\nstand roughly over here. 12625 12:14:00,661 --> 12:14:03,210 But you are a null pointer so\njust point sort of at the ground 12626 12:14:03,211 --> 12:14:04,412 as though you're pointing at 0. 12627 12:14:04,911 --> 12:14:07,508 So Pedro is our linked list\nof size 0, which pictorially 12628 12:14:07,508 --> 12:14:10,800 might look a little something like this\n 12629 12:14:10,800 --> 12:14:15,481 Now suppose that we want to go ahead\n 12630 12:14:15,482 --> 12:14:17,682 Can we get a volunteer\nto be on camera here? 12631 12:14:18,182 --> 12:14:19,349 You jumped out of your seat. 12632 12:14:21,889 --> 12:14:23,682 OK, you really want\nthe foam finger, I say. 12633 12:14:36,752 --> 12:14:39,271 So here is your number\n2 for your number field. 12634 12:14:40,502 --> 12:14:43,597 And come on, let's say that there\n 12635 12:14:44,222 --> 12:14:46,961 So Caleb got malloced,\nif you will, over here. 12636 12:14:46,961 --> 12:14:51,286 So now if we want to insert Caleb and\n 12637 12:14:51,286 --> 12:14:52,411 well what do we need to do? 12638 12:14:52,411 --> 12:14:53,822 I already initialized you to 2. 12639 12:14:53,822 --> 12:14:55,802 And pointing as you\nare to the ground means 12640 12:14:55,802 --> 12:14:58,112 you're initialized to\nnull for your next field. 
12641 12:14:58,112 --> 12:14:59,881 Pedro, what you should you-- perfect. 12642 12:15:02,101 --> 12:15:03,676 So Pedro is now pointing at the list. 12643 12:15:03,677 --> 12:15:05,802 So now our list looks a\nlittle something like this. 12644 12:15:07,652 --> 12:15:10,152 So the first couple of these\nwill be pretty straightforward. 12645 12:15:10,152 --> 12:15:13,662 Let's insert one more, if anyone\n 12646 12:15:13,661 --> 12:15:15,161 Here, how about right in the middle. 12647 12:15:16,351 --> 12:15:19,159 And just in anticipation, how\nabout let's malloc someone else. 12648 12:15:19,160 --> 12:15:20,702 OK, your friends are pointing at you. 12649 12:15:20,701 --> 12:15:22,831 Do you want to come\ndown too, preemptively? 12650 12:15:22,832 --> 12:15:25,334 This is a pool of memory, if you will. 12651 12:15:30,661 --> 12:15:32,291 And hang there for just a moment. 12652 12:15:32,792 --> 12:15:34,351 So we've just malloced Hannah. 12653 12:15:34,351 --> 12:15:37,621 And Hannah, how about Hannah,\nsuppose you ended up over there 12654 12:15:37,622 --> 12:15:39,282 in just some random location. 12655 12:15:39,781 --> 12:15:43,442 So what should we now do, if the\n 12656 12:15:44,042 --> 12:15:46,020 So Pedro, do you have\nto update yourself? 12657 12:15:47,391 --> 12:15:48,781 Caleb, what do you have to do? 12658 12:15:49,281 --> 12:15:52,173 And Hannah what should you be doing? 12659 12:15:52,173 --> 12:15:55,381 I would, it's just for you for now, so\n 12660 12:15:55,881 --> 12:15:58,771 So, again demonstrating the fact\n 12661 12:15:58,771 --> 12:16:01,291 we had our nice, clean array\nback, to back, to back 12662 12:16:01,292 --> 12:16:03,862 contiguously, these guys are\ndeliberately all over the stage. 12663 12:16:10,921 --> 12:16:12,737 And pick your favorite place in memory. 12664 12:16:16,802 --> 12:16:19,030 So Jonathan's now over there. 12665 12:16:20,072 --> 12:16:21,929 So 5, we want to point\nHannah at number 5. 
12666 12:16:21,928 --> 12:16:23,761 So you, of course, are\ngoing to point there. 12667 12:16:23,762 --> 12:16:25,137 And where should you be pointing? 12668 12:16:25,137 --> 12:16:26,982 Down to represent null, as well. 12669 12:16:29,035 --> 12:16:30,702 But now things get a little interesting. 12670 12:16:30,701 --> 12:16:33,481 And here, we'll use a chance\nto, without the weeds of code 12671 12:16:33,482 --> 12:16:36,572 point out how order of operations\nis really going to matter. 12672 12:16:36,572 --> 12:16:40,802 Suppose that I next want to\nallocate say, the number 1. 12673 12:16:40,802 --> 12:16:42,992 And I want to insert the\nnumber 1 into this list. 12674 12:16:43,491 --> 12:16:45,101 This is what the code would look like. 12675 12:16:45,101 --> 12:16:48,661 But if we act this out-- could\nwe get one more volunteer? 12676 12:16:48,661 --> 12:16:50,471 How about on the end\nthere in the sweater. 12677 12:17:01,457 --> 12:17:03,332 And how about, Lauren,\nwhy don't you go right 12678 12:17:03,332 --> 12:17:04,952 in here in front, if you don't mind. 12679 12:17:07,262 --> 12:17:09,332 So I've initialized\nLauren to the number 1. 12680 12:17:09,332 --> 12:17:11,942 And your pointer will be\nnull, pointing at the ground. 12681 12:17:11,942 --> 12:17:14,485 Where do you belong if we're\nmaintaining sorted order? 12682 12:17:14,485 --> 12:17:15,902 Looks like right at the beginning. 12683 12:17:18,902 --> 12:17:23,582 So Pedro has presumed\nto point now at Lauren. 12684 12:17:23,582 --> 12:17:27,812 But how do you know where to point? 12685 12:17:28,982 --> 12:17:30,882 SPEAKER 1: Pedro's undoing\nwhat he did a moment ago. 
12686 12:17:31,862 --> 12:17:35,232 And that was perfect that Pedro\n 12687 12:17:35,732 --> 12:17:39,432 You literally just orphaned all of these\n 12688 12:17:39,932 --> 12:17:44,281 Because if Pedro was our only variable\n 12689 12:17:44,281 --> 12:17:47,281 this is the danger of using pointers,\n 12690 12:17:47,281 --> 12:17:48,661 and building your own data structures. 12691 12:17:48,661 --> 12:17:50,619 The moment you point\ntemporarily, if you could 12692 12:17:50,620 --> 12:17:53,972 to Lauren, I have no idea\nwhere he's pointing to. 12693 12:17:53,972 --> 12:17:58,741 I have no idea how to get back to Caleb,\n 12694 12:18:01,771 --> 12:18:03,781 I think we need Lauren\nto make a decision first. 12695 12:18:05,131 --> 12:18:06,301 SPEAKER 1: So pointing at Caleb. 12696 12:18:06,601 --> 12:18:09,184 Because you're pointing at\nliterally who Pedro is pointing at. 12697 12:18:09,184 --> 12:18:10,971 Pedro, now what are you safe to do? 12698 12:18:11,472 --> 12:18:13,211 So order of operations there matters. 12699 12:18:13,211 --> 12:18:17,311 And if we had just done this line\n 12700 12:18:17,311 --> 12:18:20,222 That was like Pedro's first\ninstinct, bad things happen. 12701 12:18:20,222 --> 12:18:22,182 And we orphaned the rest of the list. 12702 12:18:22,182 --> 12:18:25,832 But if we think through it logically and\n 12703 12:18:25,832 --> 12:18:29,322 we've now updated the list to look\n 12704 12:18:30,391 --> 12:18:32,966 We got one more foam finger\nhere for the number 3. 12705 12:18:49,216 --> 12:18:52,851 If you want to go maybe in the middle of\n 12706 12:18:52,851 --> 12:18:56,752 So here, too, the goal is\nto maintain sorted order. 12707 12:18:56,752 --> 12:19:01,881 So let's ask the audience, who or what\n 12708 12:19:01,881 --> 12:19:04,391 So we don't screw up and\norphan some of the memory. 
12709 12:19:04,391 --> 12:19:07,722 And if we do orphan memory, this is\n 12710 12:19:08,591 --> 12:19:10,901 Your Mac, your PC, your\nphone can start to slow down 12711 12:19:10,902 --> 12:19:14,092 if you keep asking for memory but\n 12712 12:19:14,091 --> 12:19:15,911 So we want to get this right. 12713 12:19:20,502 --> 12:19:22,182 SPEAKER 1: 3 should point at 4. 12714 12:19:22,182 --> 12:19:25,572 So 3, do you want to point at 4? 12715 12:19:27,281 --> 12:19:32,442 And how did you know,\nMiriam, whom to point at? 12716 12:19:36,131 --> 12:19:39,701 Because if you look at where this\nlist is currently constructed 12717 12:19:39,701 --> 12:19:42,551 and you can cheat on the board\nhere, 2 is pointing to 4. 12718 12:19:42,552 --> 12:19:46,122 If you point at whoever Caleb,\nnumber 2, is pointing at 12719 12:19:46,122 --> 12:19:48,942 that, indeed, leads you\nto Hannah for number 4. 12720 12:19:48,942 --> 12:19:53,082 So now what's the next step\nto stitch this together? 12721 12:19:57,792 --> 12:20:00,385 So Caleb, I think it's now\nsafe for you to decouple. 12722 12:20:00,385 --> 12:20:02,302 Because someone is already\npointing at Hannah. 12723 12:20:02,302 --> 12:20:03,427 We haven't orphaned anyone. 12724 12:20:03,427 --> 12:20:05,322 So now, if we follow\nthe breadcrumbs, we've 12725 12:20:05,322 --> 12:20:10,351 got Pedro leading to 1,\nto 2, to 3, to 4, to 5. 12726 12:20:10,351 --> 12:20:12,851 We need the numbers back, but\nyou can keep the foam fingers. 12727 12:20:12,851 --> 12:20:15,019 Thank you to our volunteers here. 12728 12:20:17,741 --> 12:20:20,739 SPEAKER 1: You can just\nput the numbers here. 12729 12:20:21,572 --> 12:20:22,739 SPEAKER 1: Thank you to all. 12730 12:20:22,739 --> 12:20:26,682 So this is only to say that when you\n 12731 12:20:26,682 --> 12:20:29,245 and in the problem set, it's\ngoing to be very easy to lose 12732 12:20:29,245 --> 12:20:30,662 sight of the forest for the trees.
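The order of operations acted out on stage can be sketched in code. This is a hedged sketch, not the lecture's program: the function name `insert_sorted` and the pointer-to-pointer parameter are assumptions chosen so that one function can update the head. In both cases the new node is pointed at the rest of the list first, and only then is an existing pointer updated, so nothing is orphaned.

```c
#include <stdbool.h>
#include <stdlib.h>

// One node of a singly linked list: a number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Inserts number into a list kept in ascending order.
// Returns false if malloc fails, true otherwise.
bool insert_sorted(node **list, int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    n->number = number;
    n->next = NULL;

    // Case 1: empty list, or the new number belongs at the very beginning.
    if (*list == NULL || number < (*list)->number)
    {
        n->next = *list; // first, the new node points at the old first node
        *list = n;       // only then does the head point at the new node
        return true;
    }

    // Case 2: walk until the next node is not smaller than the new number.
    node *ptr = *list;
    while (ptr->next != NULL && ptr->next->number < number)
    {
        ptr = ptr->next;
    }
    n->next = ptr->next; // first, point at whatever the predecessor points at
    ptr->next = n;       // only then update the predecessor
    return true;
}
```

Swapping the two assignments in either case would reproduce the mistake from the demonstration: the head or predecessor would let go of the rest of the list before anything else was pointing at it.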
12733 12:20:30,661 --> 12:20:32,701 Because the code does get really dense. 12734 12:20:32,701 --> 12:20:37,721 But the idea is, again, really do bubble\n 12735 12:20:37,722 --> 12:20:40,781 And if you think about data\nstructures at this level. 12736 12:20:40,781 --> 12:20:42,898 If you go off and program\nafter a class like CS50 12737 12:20:42,898 --> 12:20:45,481 and you're whiteboarding something\nwith a friend or a colleague 12738 12:20:45,482 --> 12:20:48,512 most people think at\nand talk at this level. 12739 12:20:48,512 --> 12:20:51,031 And they just assume that,\nyeah, if we went back and looked 12740 12:20:51,031 --> 12:20:54,371 at our textbooks or class notes, we\n 12741 12:20:54,372 --> 12:20:56,222 But the important stuff\nis the conversation. 12742 12:20:57,601 --> 12:21:02,561 Even though, via this week, will we\n 12743 12:21:02,561 --> 12:21:06,572 So when it comes to analyzing\nan algorithm like this 12744 12:21:06,572 --> 12:21:08,641 let's consider the following. 12745 12:21:08,641 --> 12:21:15,961 What might be now the running time of\n 12746 12:21:17,582 --> 12:21:19,292 We talked about arrays earlier. 12747 12:21:19,292 --> 12:21:22,292 And we had some binary search\npossibilities still, as soon 12748 12:21:23,131 --> 12:21:26,311 But as soon as we have a linked list,\n 12749 12:21:27,661 --> 12:21:29,369 And so you can't just\nassume that you can 12750 12:21:29,370 --> 12:21:32,162 jump arithmetically to the middle\n 12751 12:21:32,982 --> 12:21:36,572 You pretty much have to follow all\n 12752 12:21:36,572 --> 12:21:39,362 So how might that inform what we see?
12753 12:21:41,076 --> 12:21:43,951 Even though I keep drawing all these\n 12754 12:21:44,461 --> 12:21:46,254 And all of us humans\nin the room can easily 12755 12:21:46,254 --> 12:21:49,842 spot where the 1 is, where the 2 is,\n 12756 12:21:49,841 --> 12:21:54,091 just like with our lockers and arrays,\n 12757 12:21:54,091 --> 12:21:57,991 And the key thing with a linked\nlist is that the only address 12758 12:21:57,991 --> 12:22:01,891 we've fundamentally been remembering\n 12759 12:22:01,891 --> 12:22:05,472 He was the link to all\nof the other nodes. 12760 12:22:05,472 --> 12:22:07,472 And, in turn, each\nperson led to the next. 12761 12:22:07,472 --> 12:22:12,132 But without Pedro, we would have lost\n 12762 12:22:12,131 --> 12:22:14,432 So when you start with\na linked list, if you 12763 12:22:14,432 --> 12:22:18,211 want to find an element as via\n 12764 12:22:18,211 --> 12:22:19,682 Following all of the arrows. 12765 12:22:19,682 --> 12:22:21,692 Following all of the\npointers on the stage 12766 12:22:21,692 --> 12:22:23,822 in order to get to the node in question. 12767 12:22:23,822 --> 12:22:27,182 And only once you hit null can\nyou conclude, yep, it was there. 12768 12:22:28,982 --> 12:22:31,922 So given that if a\ncomputer, essentially 12769 12:22:31,921 --> 12:22:36,451 can only see the number 1, or the number\n 12770 12:22:36,451 --> 12:22:39,752 or the number 5, one\nat a time, how might we 12771 12:22:39,752 --> 12:22:43,171 think about the running time of search? 12772 12:22:43,171 --> 12:22:45,091 And it is indeed Big O of n. 12773 12:22:45,891 --> 12:22:48,391 Well, in the worst case, the\nnumber you might be looking for 12774 12:22:49,961 --> 12:22:53,192 And so, obviously, you're going to\n 12775 12:22:53,192 --> 12:22:55,425 And I drew these things\nwith boxes on top of them. 
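The claim that searching a linked list is in Big O of n can be seen in a short sketch. The function name `search` is an assumption for illustration; the only thing it relies on is the node layout from the lecture. The loop has no way to skip ahead, so in the worst case it visits every node before reaching null.

```c
#include <stdbool.h>
#include <stddef.h>

// One node of a singly linked list: a number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Returns true if number is somewhere in the list, following one pointer at a time.
bool search(const node *list, int number)
{
    for (const node *ptr = list; ptr != NULL; ptr = ptr->next)
    {
        if (ptr->number == number)
        {
            return true;
        }
    }
    // Only after hitting NULL can we conclude the number is not there.
    return false;
}
```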
12776 12:22:55,425 --> 12:22:57,842 Because, again, even though\nyou and I can immediately see 12777 12:22:57,841 --> 12:23:00,091 where the 5 is for\ninstance, the computer 12778 12:23:00,091 --> 12:23:03,961 can only figure that out by starting\n 12779 12:23:03,961 --> 12:23:05,881 So there, too, is another trade off. 12780 12:23:05,881 --> 12:23:09,511 It would seem that, overnight,\nwe have lost the ability 12781 12:23:09,512 --> 12:23:14,672 to do a very powerful algorithm from\n 12782 12:23:15,302 --> 12:23:19,292 Because there's no way in this\npicture to jump mathematically 12783 12:23:19,292 --> 12:23:21,857 to the middle node, unless\nyou remember where it is. 12784 12:23:21,857 --> 12:23:23,732 And then, remember where\nevery other node is. 12785 12:23:23,732 --> 12:23:25,524 And at that point,\nyou're back to an array. 12786 12:23:25,523 --> 12:23:29,862 Linked list, by design, only\nremember the next node in the list. 12787 12:23:30,362 --> 12:23:32,851 How about something like insert? 12788 12:23:32,851 --> 12:23:35,671 In the worst case,\nperhaps, how many steps 12789 12:23:35,671 --> 12:23:38,822 might it take to insert\nsomething into a linked list? 12790 12:23:44,372 --> 12:23:45,713 Fortunately, it's not that bad. 12791 12:23:45,713 --> 12:23:46,921 It's not as bad as n squared. 12792 12:23:46,921 --> 12:23:49,201 That typically means\ndoing n things, n times. 12793 12:23:49,201 --> 12:23:53,741 And I think we can stay under\nthat, but not a bad thought. 12794 12:23:55,313 --> 12:23:56,521 SPEAKER 1: Why would it be n? 12795 12:23:56,521 --> 12:24:00,269 AUDIENCE: Because the [INAUDIBLE]. 12796 12:24:00,851 --> 12:24:03,131 So to summarize, you're proposing n. 12797 12:24:03,131 --> 12:24:04,994 Because to find where\nthe thing goes, you 12798 12:24:04,995 --> 12:24:06,912 have to traverse,\npotentially, the whole list. 
12799 12:24:06,911 --> 12:24:09,701 Because if I'm inserting the\nnumber 6 or the number 99 12800 12:24:09,701 --> 12:24:12,252 that numerically\nbelongs at the very end 12801 12:24:12,252 --> 12:24:15,311 I can only find its location\nby looking at all of them. 12802 12:24:15,311 --> 12:24:16,849 At this point, though, in the term. 12803 12:24:16,849 --> 12:24:18,641 And really, at this\npoint in the story, you 12804 12:24:18,641 --> 12:24:22,072 should start to question these very\n 12805 12:24:22,072 --> 12:24:25,841 Because the answer is almost\nalways going to depend, right. 12806 12:24:25,841 --> 12:24:28,461 If I've just got a linked\nlist that looks like this 12807 12:24:28,461 --> 12:24:31,722 the first question back to\nsomeone asking this question 12808 12:24:31,722 --> 12:24:34,781 would be, well does the list\nneed to be sorted, right? 12809 12:24:34,781 --> 12:24:37,173 I've drawn it as sorted\nand it might imply as much. 12810 12:24:37,173 --> 12:24:39,131 So that's a reasonable\nassumption to have made. 12811 12:24:39,131 --> 12:24:41,801 But if I don't care about\nmaintaining sorted order 12812 12:24:41,802 --> 12:24:45,672 I could actually insert into a\nlinked list in constant time. 12813 12:24:46,211 --> 12:24:49,110 I could just keep inserting into\n 12814 12:24:49,902 --> 12:24:51,792 And even though the\nlist is getting longer 12815 12:24:51,792 --> 12:24:55,752 the number of steps required to insert\n 12816 12:25:00,222 --> 12:25:02,382 If you want to keep it\nsorted though, yes, it's 12817 12:25:02,381 --> 12:25:03,792 going to be, indeed, Big O of n. 12818 12:25:03,792 --> 12:25:05,322 But again, these kinds\nof, now, assumptions 12819 12:25:05,322 --> 12:25:06,529 are going to start to matter. 12820 12:25:06,529 --> 12:25:09,222 So let's for the sake of\ndiscussion say it's Big O of n 12821 12:25:09,222 --> 12:25:11,141 if we do want to maintain sorted order. 12822 12:25:11,141 --> 12:25:14,292 But what about in the\ncase of not caring?
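The constant-time case mentioned above, where sorted order does not matter, could look something like the following. This is a sketch under assumptions: the name `prepend` and the pointer-to-pointer parameter are hypothetical. The point is that the number of steps is the same no matter how long the list already is, because no traversal happens.

```c
#include <stdbool.h>
#include <stdlib.h>

// One node of a singly linked list: a number plus a pointer to the next node.
typedef struct node
{
    int number;
    struct node *next;
} node;

// Inserts number at the beginning of the list without traversing it.
// Returns false if malloc fails, true otherwise.
bool prepend(node **list, int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    n->number = number;
    n->next = *list; // point the new node at the current first node
    *list = n;       // then make the new node the first node
    return true;
}
```

The price is that the list ends up in reverse order of insertion rather than sorted, which is exactly the kind of assumption worth asking about before quoting a running time.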
12823 12:25:14,292 --> 12:25:16,110 It might indeed be a Big O of 1. 12824 12:25:16,110 --> 12:25:19,152 And now these are the kinds of decisions\n 12825 12:25:19,152 --> 12:25:20,682 What about in the best case here? 12826 12:25:20,682 --> 12:25:22,722 If we're thinking about\nBig Omega notation 12827 12:25:22,722 --> 12:25:25,114 then, frankly, we could just\nget lucky in the best case. 12828 12:25:25,114 --> 12:25:27,822 And the element we're looking for\n 12829 12:25:27,822 --> 12:25:32,052 Or heck, we just blindly insert to the\n 12830 12:25:32,052 --> 12:25:33,982 that we want to keep things in. 12831 12:25:34,482 --> 12:25:39,900 So besides then, how can we\nimprove further on this design? 12832 12:25:39,900 --> 12:25:41,442 We don't need to stop at linked list. 12833 12:25:41,442 --> 12:25:43,572 Because, honestly, it's\nnot been a clear win. 12834 12:25:43,572 --> 12:25:46,421 Like, linked list allow us\nto use more of our memory 12835 12:25:46,421 --> 12:25:49,911 because we don't need massive\n 12836 12:25:50,781 --> 12:25:54,792 But they still require Big O of\nn time to find the end of it 12837 12:25:56,112 --> 12:25:59,351 We're using at least twice as\nmuch memory for the darn pointer. 12838 12:25:59,351 --> 12:26:01,601 So that seems like a sidestep. 12839 12:26:01,601 --> 12:26:03,582 It's not really a step forward. 12840 12:26:05,322 --> 12:26:09,639 Here's where we can now accelerate the\n 12841 12:26:09,639 --> 12:26:11,472 even if you haven't\nused this technique yet 12842 12:26:11,472 --> 12:26:15,612 we would seem to have an ability to\n 12843 12:26:16,601 --> 12:26:19,002 And anything you could\nimagine drawing with arrows 12844 12:26:19,002 --> 12:26:21,622 you can implement, it\nwould seem, in code. 12845 12:26:21,622 --> 12:26:24,102 So what if we leverage\na second dimension. 
12846 12:26:24,101 --> 12:26:26,618 Instead of just stringing\ntogether things laterally 12847 12:26:26,618 --> 12:26:28,451 left to right, essentially,\neven though they 12848 12:26:28,451 --> 12:26:30,101 were bouncing around on the screen. 12849 12:26:30,101 --> 12:26:33,252 What if we start to leverage a\n 12850 12:26:33,252 --> 12:26:36,881 And build more interesting\nstructures in the computer's memory. 12851 12:26:36,881 --> 12:26:39,671 Well it turns out that\nin a computer's memory 12852 12:26:39,671 --> 12:26:42,612 we could create a tree,\nsimilar to a family tree. 12853 12:26:42,612 --> 12:26:46,362 If you've ever seen or drawn a family\n 12854 12:26:50,442 --> 12:26:53,652 So an inverted branch of a\ntree that grows, typically 12855 12:26:53,652 --> 12:26:56,531 when it's drawn, downward instead\nof upward like a typical tree. 12856 12:26:56,531 --> 12:26:59,021 But that's something we could\ntranslate into code as well. 12857 12:26:59,021 --> 12:27:02,721 Specifically, let's do something\ncalled a binary search tree. 12858 12:27:04,601 --> 12:27:07,152 And what I mean by\nthis is the following. 12859 12:27:07,961 --> 12:27:10,841 This is an example of an\narray from like week 2 12860 12:27:10,841 --> 12:27:12,231 when we first talked about those. 12861 12:27:12,232 --> 12:27:13,932 And we had the lockers on stage. 12862 12:27:13,932 --> 12:27:19,961 And recall that what was nice\nabout an array, if 1, it's sorted. 12863 12:27:19,961 --> 12:27:23,021 And 2, all of its numbers\nare indeed contiguous 12864 12:27:23,021 --> 12:27:25,011 which is by definition an array. 12865 12:27:25,012 --> 12:27:26,752 We can just do some simple math. 12866 12:27:26,752 --> 12:27:31,461 For instance, if there are 7 elements\n 12867 12:27:31,961 --> 12:27:34,811 3 and 1/2, round down\nthrough truncation, that's 3. 12868 12:27:36,161 --> 12:27:39,414 That gives me the middle element,\narithmetically, in this thing.
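The arithmetic described above, where 7 divided by 2 truncates to index 3, is what makes binary search possible on a sorted array. A minimal sketch follows, assuming an array of ints in ascending order; the name `binary_search` and its parameters are illustrative, not taken from the lecture.

```c
#include <stdbool.h>

// Searches a sorted array of n ints for target by repeatedly halving the range.
bool binary_search(const int values[], int n, int target)
{
    int low = 0;
    int high = n - 1;
    while (low <= high)
    {
        // Integer division truncates, so for 7 elements the first middle is index 3.
        int middle = low + (high - low) / 2;
        if (values[middle] == target)
        {
            return true;
        }
        else if (values[middle] < target)
        {
            low = middle + 1; // discard the left half
        }
        else
        {
            high = middle - 1; // discard the right half
        }
    }
    return false;
}
```

This only works because array elements are contiguous, so the address of the middle element can be computed directly instead of being reached by following pointers.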
12869 12:27:39,415 --> 12:27:41,832 And even though I have to be\ncareful about rounding, using 12870 12:27:41,832 --> 12:27:45,912 simple arithmetic, I can very quickly,\n 12871 12:27:45,911 --> 12:27:48,371 find for you the middle of the\nleft half, of the left half 12872 12:27:48,372 --> 12:27:49,664 of the right half, or whatever. 12873 12:27:49,663 --> 12:27:50,961 That's the power of arrays. 12874 12:27:50,961 --> 12:27:52,902 And that's what gave us binary search. 12875 12:27:52,902 --> 12:27:54,421 And how did binary search work? 12876 12:27:54,421 --> 12:27:55,671 Well, we looked at the middle. 12877 12:27:55,671 --> 12:27:57,311 And then, we went left or right. 12878 12:27:57,311 --> 12:28:02,561 And then, we went left or right again,\n 12879 12:28:02,561 --> 12:28:07,691 Wouldn't it be nice if we\nsomehow preserved the new upsides 12880 12:28:07,692 --> 12:28:10,520 today of dynamic memory\nallocation, giving ourselves 12881 12:28:10,519 --> 12:28:13,061 the ability to just add another\nelement, add another element 12882 12:28:14,232 --> 12:28:16,782 But retain the power of binary search. 12883 12:28:16,781 --> 12:28:21,582 Because log of n was much better than\n 12884 12:28:21,582 --> 12:28:24,461 Even the phone book\ndemonstrated as much weeks ago. 12885 12:28:24,461 --> 12:28:28,491 So what if I draw this same\npicture in 2 dimensions. 12886 12:28:28,491 --> 12:28:32,441 And I preserve the color scheme,\n 12887 12:28:32,442 --> 12:28:35,982 What are these things look like now? 12888 12:28:35,982 --> 12:28:38,532 Maybe, like, things we\nmight now call nodes, right. 12889 12:28:38,531 --> 12:28:42,512 A node is just a generic term\nfor like, storing some data. 12890 12:28:42,512 --> 12:28:45,682 What if the data these nodes\nare storing are numbers. 12891 12:28:47,211 --> 12:28:51,341 But what if we connected these\n 12892 12:28:51,341 --> 12:28:56,711 Whereby, every node has not one\npointer now, but as many as 2. 
12893 12:28:56,711 --> 12:28:59,811 Maybe 0, like in the leaves\nat the bottom are in green. 12894 12:28:59,811 --> 12:29:02,932 But other nodes on the interior\nmight have as many as 2. 12895 12:29:02,932 --> 12:29:04,732 Like having 2 children, so to speak. 12896 12:29:04,732 --> 12:29:06,902 And indeed, the vernacular\nhere is exactly that. 12897 12:29:06,902 --> 12:29:08,812 This would be called\nthe root of the tree. 12898 12:29:08,811 --> 12:29:11,752 Or this would be a parent,\nwith respect to these children. 12899 12:29:11,752 --> 12:29:14,391 The green ones would be\ngrandchildren, respect to these. 12900 12:29:14,391 --> 12:29:19,012 The green ones would be siblings\nwith respect to each other. 12901 12:29:19,851 --> 12:29:22,144 So all the same jargon you\nmight use in the real world 12902 12:29:22,144 --> 12:29:25,402 applies in the world of data\nstructures and CS trees. 12903 12:29:25,402 --> 12:29:30,292 But this is interesting because I think\n 12904 12:29:30,292 --> 12:29:32,781 structure in the computer's memory. 12905 12:29:33,322 --> 12:29:37,521 Well, suppose that we defined\na node to be no longer just 12906 12:29:37,521 --> 12:29:39,591 this, a number in a next field. 12907 12:29:39,591 --> 12:29:42,351 What if we give ourselves\na bit more room here? 12908 12:29:42,351 --> 12:29:47,211 And give ourselves a pointer called\n 12909 12:29:47,211 --> 12:29:49,561 Both of which is a\npointer to a struct node. 12910 12:29:49,561 --> 12:29:53,511 So same idea as before, but now we\n 12911 12:29:53,512 --> 12:29:56,692 as pointing this way and\nthis way, not just this way. 12912 12:29:56,692 --> 12:29:58,762 Not just a single direction, but 2. 12913 12:29:58,762 --> 12:30:02,662 So you could imagine, in code, building\n 12914 12:30:02,661 --> 12:30:06,051 That creates, in essence,\nthis diagram here. 12915 12:30:07,732 --> 12:30:09,772 Suppose I want to find the number 3. 
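The two-pointer node described above might be declared along these lines. The field names `left` and `right` are inferred from how the lecture later refers to the left and right children; the helper `make_node` is hypothetical and exists only to show the fields being initialized.

```c
#include <stdlib.h>

// A node for a binary search tree: a number plus pointers to up to two children.
typedef struct node
{
    int number;
    struct node *left;  // subtree of smaller numbers, or NULL
    struct node *right; // subtree of larger numbers, or NULL
} node;

// Hypothetical helper: allocates a node with no children.
// Returns NULL if malloc fails.
node *make_node(int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = number;
    n->left = NULL;
    n->right = NULL;
    return n;
}
```

Compared with a linked list node, each node here carries one extra pointer, which is the space cost of being able to branch in two directions.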
12916 12:30:09,771 --> 12:30:12,322 I want to search for the\nnumber 3 in this tree. 12917 12:30:12,322 --> 12:30:15,682 It would seem, just like Pedro was\n 12918 12:30:15,682 --> 12:30:18,572 in the world of trees,\nthe root, so to speak 12919 12:30:18,572 --> 12:30:20,572 is the beginning of your data structure. 12920 12:30:20,572 --> 12:30:26,211 You can retain and remember this entire\n 12921 12:30:26,752 --> 12:30:29,811 One variable can hang\non to this whole tree. 12922 12:30:29,811 --> 12:30:32,002 So how can I find the number 3? 12923 12:30:32,002 --> 12:30:36,141 Well, if I look at the root node and\n 12924 12:30:37,732 --> 12:30:40,052 Or if it's greater\nthan, I can go this way. 12925 12:30:40,052 --> 12:30:42,232 So I preserve that\nproperty of the phone book 12926 12:30:42,232 --> 12:30:44,482 or just a sorted array in general. 12927 12:30:45,802 --> 12:30:48,810 If I'm looking for 3, I can\ngo to the right of the 2 12928 12:30:48,809 --> 12:30:50,601 because that number is\ngoing to be greater. 12929 12:30:50,601 --> 12:30:53,161 If I go left, it's going\nto be smaller instead. 12930 12:30:53,161 --> 12:30:55,911 And here's an example\nof actually recursion. 12931 12:30:55,911 --> 12:30:59,572 Recursion in a physical sense\nmuch like Mario's pyramid. 12932 12:30:59,572 --> 12:31:01,732 Which was recursively defined. 12933 12:31:02,781 --> 12:31:04,731 I claim this whole thing is a tree. 12934 12:31:04,732 --> 12:31:08,272 Specifically, a binary search\ntree, which means every node 12935 12:31:08,271 --> 12:31:11,361 has 2, or maybe 1, or maybe 0 children. 12936 12:31:14,211 --> 12:31:19,641 And it's the case that every left\n 12937 12:31:19,641 --> 12:31:22,612 And every right child\nis larger than the root. 12938 12:31:22,612 --> 12:31:25,582 That definition certainly\nworks for 2, 4, and 6.
12939 12:31:25,582 --> 12:31:30,412 But it also works recursively for\n 12940 12:31:30,411 --> 12:31:32,391 Notice, if you think\nof this as the root 12941 12:31:32,391 --> 12:31:34,461 it is indeed bigger\nthan this left child. 12942 12:31:34,461 --> 12:31:36,561 And it's smaller than this right child. 12943 12:31:36,561 --> 12:31:39,081 And if you look even at\nthe leaves, so to speak. 12944 12:31:40,491 --> 12:31:44,169 This root node is bigger than\nits left child, if it existed. 12945 12:31:44,169 --> 12:31:45,502 So it's a meaningless statement. 12946 12:31:45,502 --> 12:31:47,692 And it's less than its right child. 12947 12:31:47,692 --> 12:31:50,482 Or it's not greater than, certainly,\nso that's meaningless too. 12948 12:31:50,482 --> 12:31:54,242 So we haven't violated the definition\n 12949 12:31:54,241 --> 12:31:57,711 And so, now, how many steps does\n 12950 12:31:57,711 --> 12:32:02,061 any number in a binary\nsearch tree, it would seem? 12951 12:32:04,012 --> 12:32:05,882 And the height of this\nthing is actually 3. 12952 12:32:05,881 --> 12:32:08,631 And so long story short, especially,\n 12953 12:32:08,631 --> 12:32:10,792 with your logarithms from yesteryear. 12954 12:32:10,792 --> 12:32:14,601 Log base 2 is the number of times you\n 12955 12:32:14,601 --> 12:32:16,341 and half, until you get down to 1. 12956 12:32:16,341 --> 12:32:19,309 This is like a logarithm\nin the reverse direction. 12957 12:32:19,309 --> 12:32:20,601 Here's a whole lot of elements. 12958 12:32:20,601 --> 12:32:22,972 And we're halving, we're\nhalving until we get down to 1. 12959 12:32:22,972 --> 12:32:27,125 So the height of this tree, that\nis to say, is log base 2 of n. 12960 12:32:27,125 --> 12:32:30,292 Which means that even in the worst case,\n 12961 12:32:30,292 --> 12:32:32,167 it's all the way at the\nbottom in the leaves.
12962 12:32:32,811 --> 12:32:37,701 It's going to take log base 2\nof n steps, or log of n steps 12963 12:32:37,701 --> 12:32:41,311 to find, maximally, any\none of those numbers. 12964 12:32:41,311 --> 12:32:46,101 So, again, binary search is back. 12965 12:32:46,101 --> 12:32:48,116 But we've paid a price, right. 12966 12:32:48,116 --> 12:32:49,491 This isn't a linked list anymore. 12967 12:32:50,673 --> 12:32:53,631 But we've gained back binary search,\n 12968 12:32:53,631 --> 12:32:56,256 That's where the whole class\nbegan, on making that distinction. 12969 12:32:56,256 --> 12:33:01,502 But what price have we paid to retain\n 12970 12:33:04,552 --> 12:33:06,532 It's no longer sorted\nleft to right, but this 12971 12:33:06,531 --> 12:33:09,502 is, I claim, sorted, according to\n 12972 12:33:09,502 --> 12:33:13,491 Where, again, left child\nis smaller than root. 12973 12:33:13,491 --> 12:33:15,921 And right child is greater than root. 12974 12:33:15,921 --> 12:33:19,341 So it is sorted, but it's sorted in\n 12975 12:33:22,741 --> 12:33:24,152 AUDIENCE: [INAUDIBLE] nodes now. 12976 12:33:24,944 --> 12:33:29,312 Every node now needs not one\nnumber, but 2, 3 pieces of data. 12977 12:33:31,112 --> 12:33:32,866 So, again, there's that trade off again. 12978 12:33:32,866 --> 12:33:34,741 Where, well, if you want\nto save time, you've 12979 12:33:34,741 --> 12:33:37,561 got to give something. If\nyou start giving space 12980 12:33:37,561 --> 12:33:40,028 and you start using more\nspace, you can speed up time. 12981 12:33:40,862 --> 12:33:42,122 There's always a price paid. 12982 12:33:42,122 --> 12:33:47,882 And it's very often in space, or time,\n 12983 12:33:47,881 --> 12:33:49,511 the number of bugs you have to solve. 12984 12:33:49,512 --> 12:33:51,542 I mean, all of these\nare finite resources 12985 12:33:51,542 --> 12:33:53,315 that you have to juggle among.
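The search just described, going left for smaller numbers and right for larger ones, lends itself to a recursive sketch. The function name `search` is an assumption for illustration, and the node layout with `left` and `right` pointers follows the lecture's description. Each call discards one subtree, so on a balanced tree the number of calls is bounded by the height, roughly log base 2 of n.

```c
#include <stdbool.h>
#include <stddef.h>

// A node for a binary search tree: a number plus pointers to up to two children.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Returns true if number is in the tree rooted at tree.
bool search(const node *tree, int number)
{
    if (tree == NULL)
    {
        return false; // fell off the bottom of the tree: not found
    }
    else if (number < tree->number)
    {
        return search(tree->left, number); // smaller values live to the left
    }
    else if (number > tree->number)
    {
        return search(tree->right, number); // larger values live to the right
    }
    else
    {
        return true; // neither smaller nor larger, so this is the number
    }
}
```

The logarithmic bound depends on the tree being balanced; nothing in this sketch keeps it that way.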
12986 12:33:53,315 --> 12:33:55,982 So if we consider now the code\nwith which we can implement this 12987 12:33:57,601 --> 12:34:00,551 And how might we actually\nuse something like this? 12988 12:34:00,552 --> 12:34:03,002 Well, let's take a look at,\nmaybe, one final program. 12989 12:34:03,002 --> 12:34:07,122 And see here, before we transition\n 12990 12:34:07,122 --> 12:34:11,552 Let me go ahead here and let me just\n 12991 12:34:11,552 --> 12:34:15,692 So let me, in a moment, copy\nover file called tree.c. 12992 12:34:15,692 --> 12:34:18,550 Which we'll have on\nthe course's websites. 12993 12:34:18,550 --> 12:34:20,342 And I'll walk you\nthrough some of the logic 12994 12:34:20,341 --> 12:34:25,271 here that I've written for tree.c. 12995 12:34:25,771 --> 12:34:27,281 So what do we have here first? 12996 12:34:27,281 --> 12:34:31,921 So here is an implementation of\n 12997 12:34:31,921 --> 12:34:36,341 And as before, I've played around and\n 12998 12:34:37,771 --> 12:34:41,611 Here is my definition of a node for a\n 12999 12:34:41,612 --> 12:34:44,491 from what I proposed on\nthe board a moment ago. 13000 12:34:44,491 --> 12:34:47,191 Here are 2 prototypes for\n2 functions, that I'll 13001 12:34:47,192 --> 12:34:49,262 show you in a moment,\nthat allow me to free 13002 12:34:49,262 --> 12:34:52,652 an entire tree, one node at a time. 13003 12:34:52,652 --> 12:34:55,382 And then, also allow me to\nprint the tree in order. 13004 12:34:55,381 --> 12:34:57,781 So even though they're\nnot sorted left to right 13005 12:34:57,781 --> 12:35:00,932 I bet if I'm clever about\nwhat child I print first 13006 12:35:00,932 --> 12:35:04,152 I can reconstruct the idea of\nprinting this tree properly. 13007 12:35:04,152 --> 12:35:06,632 So how might I implement\na binary search tree? 13008 12:35:07,921 --> 12:35:10,502 Here is how I might\nrepresent a tree of size 0. 13009 12:35:10,502 --> 12:35:13,442 It's just a null pointer called tree. 
13010 12:35:13,442 --> 12:35:15,542 Here's how I might add\na number to that tree. 13011 12:35:15,542 --> 12:35:19,561 So here, for instance, is me\nmallocing space for a node. 13012 12:35:19,561 --> 12:35:21,691 Storing it in a temporary\nvariable called n. 13013 12:35:21,692 --> 12:35:23,552 Here is me just doing a safety check. 13014 12:35:23,552 --> 12:35:25,262 Make sure n does not equal null. 13015 12:35:25,262 --> 12:35:29,612 And then, here is me initializing this\n 13016 12:35:29,612 --> 12:35:32,342 Then, initializing the left\nchild of that node to be null. 13017 12:35:32,341 --> 12:35:34,991 And the right child of\nthat node to be null. 13018 12:35:34,991 --> 12:35:40,152 And then, initializing the tree itself\n 13019 12:35:40,152 --> 12:35:43,322 So at this point in the story, there's\n 13020 12:35:43,322 --> 12:35:46,222 containing the number\n2 with no children. 13021 12:35:46,722 --> 12:35:49,112 Let's just add manually\nto this a little further. 13022 12:35:49,112 --> 12:35:52,262 Let's add another number to the\ntree, by mallocing another node. 13023 12:35:52,262 --> 12:35:55,622 I don't need to declare n as a node*\n 13024 12:35:56,262 --> 12:35:58,202 Here's a little safety check. 13025 12:35:58,201 --> 12:36:02,761 I'm going to not bother with my,\n 13026 12:36:07,285 --> 12:36:09,452 We'd want to free memory too,\nwhich I've not done here 13027 12:36:09,451 --> 12:36:11,131 but I'll save that for another time. 13028 12:36:11,131 --> 12:36:13,471 Here, I'm going to\ninitialize the number to 1. 13029 12:36:13,472 --> 12:36:17,582 I'm going to initialize the children\n 13030 12:36:17,582 --> 12:36:19,292 And now, I'm going to do this. 13031 12:36:19,292 --> 12:36:23,762 Initialize the tree's\nleft child to be n. 
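The manual steps described so far can be sketched roughly as follows. This is not the lecture's tree.c: the field names number, left, and right, and the helpers make_node and build_small_tree, are assumptions made for illustration. Unlike the on-screen code, this version frees the root if the second allocation fails, which is the cleanup mentioned as being skipped.

```c
#include <stdlib.h>

// Assumed node layout for a binary search tree of ints.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Hypothetical helper: allocate one childless node holding `number`.
// Returns NULL if malloc fails (the "safety check").
node *make_node(int number)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    n->number = number;
    n->left = NULL;
    n->right = NULL;
    return n;
}

// Hypothetical helper: build the tree as it stands at this point,
// with 2 at the root and 1 as its left child.
node *build_small_tree(void)
{
    node *tree = NULL; // a tree of size 0 is just a null pointer

    node *n = make_node(2);
    if (n == NULL)
    {
        return NULL;
    }
    tree = n; // the tree now points at its first node

    n = make_node(1);
    if (n == NULL)
    {
        free(tree); // do not leak the root if the second malloc fails
        return NULL;
    }
    tree->left = n; // stitch the new node in as the left child

    return tree;
}
```

Adding the third node follows the same pattern, except that the new node is assigned to tree->right.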
13032 12:36:23,762 --> 12:36:26,704 So what that's essentially\ndoing here is if this 13033 12:36:26,703 --> 12:36:29,911 is my root node, the single rectangle\n 13034 12:36:29,911 --> 12:36:32,011 has no children, neither left nor right. 13035 12:36:32,012 --> 12:36:33,961 Here's my new node with the number 1. 13036 12:36:33,961 --> 12:36:36,101 I want it to become the new left child. 13037 12:36:36,101 --> 12:36:39,631 So that line of code on the\nscreen there, tree left equals n 13038 12:36:39,631 --> 12:36:44,201 is like stitching these 2 together\n 13039 12:36:44,701 --> 12:36:47,581 The next lines of code,\nyou can probably guess 13040 12:36:47,582 --> 12:36:50,042 are me adding another\nnumber to the list. 13041 12:36:51,211 --> 12:36:56,682 So this is a simpler tree with\n2, 1, and, 3 respectively. 13042 12:36:56,682 --> 12:36:59,192 And this code, let me wave\nmy hands, is almost the same. 13043 12:36:59,192 --> 12:37:02,492 Except for the fact that I'm\nupdating the tree's right child 13044 12:37:02,491 --> 12:37:04,472 to be this new and third node. 13045 12:37:04,472 --> 12:37:07,862 Let's now run the code before\nlooking at those 2 functions. 13046 12:37:12,991 --> 12:37:16,411 So it sounds like the data structure\n 13047 12:37:16,411 --> 12:37:18,181 But how did I actually print this? 13048 12:37:18,182 --> 12:37:20,072 And then, eventually,\nfree the whole thing? 13049 12:37:20,072 --> 12:37:23,461 Well let's look at the\ndefinition of first print tree. 13050 12:37:23,461 --> 12:37:26,432 And this is where\nthings get interesting. 13051 12:37:26,432 --> 12:37:30,271 Print tree returns nothing\nso it's a void function. 13052 12:37:30,271 --> 12:37:36,002 But it takes a pointer to a root element\n 13053 12:37:37,171 --> 12:37:39,271 If root equals equals\nnull, there's obviously 13054 12:37:39,271 --> 12:37:40,591 nothing to print, just return. 13055 12:37:42,451 --> 12:37:44,491 But here's where things\nget a little magical. 
13056 12:37:44,491 --> 12:37:47,761 Otherwise, print your left child. 13057 12:37:50,491 --> 12:37:53,911 Then, print your right child. 13058 12:37:53,911 --> 12:37:59,181 What is this an example of, even\n 13059 12:37:59,182 --> 12:38:00,802 What programming technique here? 13060 12:38:02,398 --> 12:38:05,853 So this is actually perhaps the most\n 13061 12:38:05,853 --> 12:38:08,061 It wasn't really that\ncompelling with the Mario thing 13062 12:38:08,061 --> 12:38:10,191 because we had such an easy\nimplementation with a for loop 13063 12:38:11,031 --> 12:38:15,652 But here is a perfect application of\n 13064 12:38:17,391 --> 12:38:19,701 If you take any snip\nof any branch, it all 13065 12:38:19,701 --> 12:38:22,072 still looks like a tree,\njust a smaller one. 13066 12:38:22,072 --> 12:38:23,911 That lends itself to recursion. 13067 12:38:23,911 --> 12:38:28,491 So here is this leap of faith where I\n 13068 12:38:28,491 --> 12:38:31,311 tree, if you will, via\nmy child at the left. 13069 12:38:31,311 --> 12:38:34,612 Then, I'll print my own root\nnode here in the middle. 13070 12:38:34,612 --> 12:38:37,222 Then, go ahead and\nprint my right subtree. 13071 12:38:37,222 --> 12:38:41,662 And because we have this base case that\n 13072 12:38:41,661 --> 12:38:44,448 there's nothing to do, you're\nnot going to recurse infinitely. 13073 12:38:44,449 --> 12:38:47,031 You're not going to call yourself\nagain, and again, and again 13074 12:38:48,692 --> 12:38:52,882 So it works out and prints\nthe 1, the 2, and the 3. 13075 12:38:52,881 --> 12:38:54,322 And notice what we could do, too. 13076 12:38:54,322 --> 12:38:57,741 If you wanted to print the tree in\n 13077 12:38:57,741 --> 12:39:00,531 Print your right subtree\nfirst, the greater element. 13078 12:39:01,432 --> 12:39:02,811 Then, your smaller subtree. 13079 12:39:02,811 --> 12:39:05,451 And if I do make tree\nhere and ./tree, well now 13080 12:39:05,451 --> 12:39:07,581 I've reversed the order of the list. 
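The in-order and reverse-order traversals described above might look something like this. It is a sketch under assumptions, not the lecture's file: the node layout and the names print_tree_reversed and collect_in_order are hypothetical. collect_in_order is the same recursion as print_tree, but it stores the numbers in an array so the visiting order can be checked without reading printed output.

```c
#include <stdio.h>

// Assumed node layout for a binary search tree of ints.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// In order: left subtree, then this node, then right subtree.
void print_tree(const node *root)
{
    if (root == NULL)
    {
        return; // base case: nothing to print
    }
    print_tree(root->left);
    printf("%i\n", root->number);
    print_tree(root->right);
}

// Reverse order: swap which subtree is visited first.
void print_tree_reversed(const node *root)
{
    if (root == NULL)
    {
        return;
    }
    print_tree_reversed(root->right);
    printf("%i\n", root->number);
    print_tree_reversed(root->left);
}

// Hypothetical helper: same order as print_tree, but writes into out[]
// starting at index `count` and returns the new count.
// The caller must make sure out[] has room for every node.
int collect_in_order(const node *root, int *out, int count)
{
    if (root == NULL)
    {
        return count;
    }
    count = collect_in_order(root->left, out, count);
    out[count] = root->number;
    count++;
    return collect_in_order(root->right, out, count);
}
```

For the tree with 2 at the root, 1 on the left, and 3 on the right, print_tree visits 1, 2, 3 and print_tree_reversed visits 3, 2, 1.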
13081 12:39:08,671 --> 12:39:10,421 You can do it with a\nfor-loop in an array. 13082 12:39:10,421 --> 12:39:13,851 But you can also do it, even with\nthis 2-dimensional structure. 13083 12:39:13,851 --> 12:39:17,661 Let's lastly look at\nthis free tree function. 13084 12:39:17,661 --> 12:39:19,641 And this one's almost the same. 13085 12:39:19,641 --> 12:39:22,881 Order doesn't matter in quite the\n 13086 12:39:22,881 --> 12:39:24,502 Here's what I did with free tree. 13087 12:39:24,502 --> 12:39:27,459 Well, if the root of the tree is\n 13088 12:39:28,042 --> 12:39:32,582 Otherwise, go ahead and free your\n 13089 12:39:32,582 --> 12:39:35,572 Then free your right child\nand all of its descendants. 13090 12:39:37,381 --> 12:39:43,171 And again, free literally just\n 13091 12:39:43,171 --> 12:39:45,051 It doesn't free the whole darn thing. 13092 12:39:45,052 --> 12:39:47,332 It just frees literally\nwhat's at that address. 13093 12:39:47,332 --> 12:39:51,382 Why was it important that\nI did line 72 last, though? 13094 12:39:51,381 --> 12:39:53,932 Why did I free the left\nchild and the right child 13095 12:39:53,932 --> 12:39:57,455 before I freed myself, so to speak? 13096 12:39:59,163 --> 12:40:03,621 If you free yourself first, if I had\n 13097 12:40:03,622 --> 12:40:08,302 you're not allowed to touch the left\n 13098 12:40:08,302 --> 12:40:10,832 Because the memory address is\nno longer valid at that point. 13099 12:40:10,832 --> 12:40:12,772 You would get some\nmemory error, perhaps. 13100 12:40:13,792 --> 12:40:15,472 Valgrind definitely wouldn't like it. 13101 12:40:15,472 --> 12:40:17,542 Bad things would otherwise happen. 13102 12:40:17,542 --> 12:40:19,372 But here, then, is an\nexample of recursion. 13103 12:40:19,372 --> 12:40:23,842 And again, just a recursive use\nof an actual data structure. 
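A minimal sketch of the freeing order discussed above, again assuming a node with number, left, and right fields. The important detail is that free(root) comes last: once a node is freed, reading root->left or root->right is no longer valid.

```c
#include <stdlib.h>

// Assumed node layout for a binary search tree of ints.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Free every node in the tree rooted at `root`.
void free_tree(node *root)
{
    if (root == NULL)
    {
        return; // base case: an empty tree has nothing to free
    }
    free_tree(root->left);  // left child and all of its descendants
    free_tree(root->right); // right child and all of its descendants
    free(root);             // only now is it safe to free this node
}
```

Calling free_tree on the root once is enough; the recursion reaches every descendant before its parent is released.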
13104 12:40:23,841 --> 12:40:26,601 And what's even cooler here\nis, relatively speaking 13105 12:40:26,601 --> 12:40:29,121 suppose we wanted to\nsearch something like this. 13106 12:40:29,122 --> 12:40:33,202 Binary search actually gets pretty\n 13107 12:40:33,891 --> 12:40:38,421 here might be the prototype for a search\n 13108 12:40:38,421 --> 12:40:43,402 You give me the root of a tree, and\n 13109 12:40:43,402 --> 12:40:47,362 and I can pretty easily now return true\n 13110 12:40:47,932 --> 12:40:49,912 Well, let's first ask a question. 13111 12:40:49,911 --> 12:40:52,876 If tree equals equals null,\nthen you just return false. 13112 12:40:52,877 --> 12:40:56,002 Because if there's no tree, there's no\n 13113 12:40:57,341 --> 12:41:04,041 Else if, the number you're looking for\n 13114 12:41:04,042 --> 12:41:06,052 which direction should we go? 13115 12:41:08,671 --> 12:41:11,781 Well, let's just return the\nanswer to this question. 13116 12:41:11,781 --> 12:41:15,921 Search the left sub tree,\nby way of my left child 13117 12:41:15,921 --> 12:41:17,451 looking for the same number. 13118 12:41:17,451 --> 12:41:19,731 And you just assume through\nthe beauty of recursion 13119 12:41:19,732 --> 12:41:22,882 that you're kicking the can\nand let yourself figure it out 13120 12:41:24,082 --> 12:41:26,542 Just that snipped left tree instead. 13121 12:41:26,542 --> 12:41:30,802 Else if, the number you're looking for\n 13122 12:41:30,802 --> 12:41:32,641 go to the right, as you might infer. 13123 12:41:32,641 --> 12:41:35,542 So I can just return the\nanswer to this question. 13124 12:41:35,542 --> 12:41:38,631 Search my right sub tree\nfor that same number. 13125 12:41:38,631 --> 12:41:40,502 And there's a fourth\nand final condition. 13126 12:41:40,502 --> 12:41:43,732 What's the fourth scenario we\nhave to consider, explicitly? 13127 12:41:45,262 --> 12:41:47,304 SPEAKER 1: If the number,\nitself, is right there. 
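The four cases just enumerated can be sketched as a recursive function. The node layout is an assumption, and the final case is written as a plain else rather than a fourth explicit comparison, since it is the only possibility left once the other three have been ruled out.

```c
#include <stdbool.h>
#include <stddef.h>

// Assumed node layout for a binary search tree of ints.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Return true if `number` is somewhere in the tree rooted at `tree`.
bool search(const node *tree, int number)
{
    if (tree == NULL)
    {
        return false; // no tree, so no number
    }
    else if (number < tree->number)
    {
        return search(tree->left, number); // look only in the left subtree
    }
    else if (number > tree->number)
    {
        return search(tree->right, number); // look only in the right subtree
    }
    else
    {
        return true; // neither smaller nor larger: found it
    }
}
```

Each call discards one subtree, which is what gives logarithmic behavior when the tree is balanced.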
13128 12:41:47,303 --> 12:41:50,961 So else if, the number I'm looking\n 13129 12:41:50,961 --> 12:41:53,731 then and only then,\nshould you return true. 13130 12:41:53,732 --> 12:41:55,972 And if you're thinking\nquickly here, there's 13131 12:41:55,972 --> 12:41:59,632 an optimization possible,\na better design opportunity. 13132 12:41:59,631 --> 12:42:01,131 Think back to even our Scratch days. 13133 12:42:01,131 --> 12:42:03,252 What could we do a little better here? 13134 12:42:06,622 --> 12:42:09,164 Because if there's logically\nonly 4 things that could happen 13135 12:42:09,163 --> 12:42:12,021 you're wasting your time by asking\na fourth gratuitous question. 13136 12:42:13,341 --> 12:42:16,981 So here too, more so than the\nMario example a few weeks ago 13137 12:42:16,982 --> 12:42:19,582 there's just this elegance\narguably to recursion. 13138 12:42:21,442 --> 12:42:25,432 This is the code for binary\nsearch on a binary search tree. 13139 12:42:25,432 --> 12:42:27,502 And so, recursion tends\nto work in lockstep 13140 12:42:27,502 --> 12:42:32,182 with these kinds of data structures\n 13141 12:42:34,161 --> 12:42:39,841 Any questions, then, on binary search\n 13142 12:42:40,709 --> 12:42:42,656 AUDIENCE: About like third years. 13143 12:42:48,211 --> 12:42:54,171 So when returning a Boolean value, true\n 13144 12:42:54,171 --> 12:42:57,832 in a library called Standard\nBool, S-T-D-B-O-O-L dot H. 13145 12:42:57,832 --> 12:42:59,961 With a header file that you can use. 13146 12:42:59,961 --> 12:43:06,739 It is the case that true is, it's\n 13147 12:43:06,739 --> 12:43:08,031 But they would map indeed, yes. 13148 12:43:09,442 --> 12:43:11,872 But you should not compare\nthem explicitly to 0 and 1. 13149 12:43:11,872 --> 12:43:14,872 When you're using true and false, you\n 13150 12:43:14,872 --> 12:43:18,857 AUDIENCE: I meant if\nit's in a code return. 
13151 12:43:19,732 --> 12:43:23,332 So if I am in my own code from\nearlier, a void function 13152 12:43:23,332 --> 12:43:25,762 it is totally fine to return. 13153 12:43:25,762 --> 12:43:28,432 You just can't return\nsomething explicitly. 13154 12:43:28,432 --> 12:43:30,201 So return just means that's it. 13155 12:43:31,762 --> 12:43:33,632 You're not actually\nhanding back a value. 13156 12:43:33,631 --> 12:43:37,252 So it's a way of short\ncircuiting the execution. 13157 12:43:37,252 --> 12:43:39,531 If you don't like that,\nand some people do frown 13158 12:43:39,531 --> 12:43:44,241 upon having code return from functions\n 13159 12:43:45,531 --> 12:43:49,222 If the root does not equal\nnull, do all of these things. 13160 12:43:49,222 --> 12:43:51,502 And then, indent all three\nof these lines underneath. 13161 12:43:52,972 --> 12:43:54,772 I happen to write it\nthe other way just so 13162 12:43:54,771 --> 12:43:58,471 that there was explicitly a base case\n 13163 12:43:58,472 --> 12:44:01,402 Whereas, now, it's\nimplicitly there for us only. 13164 12:44:03,771 --> 12:44:07,441 So let's ask the question as\nbefore about running time of this. 13165 12:44:07,442 --> 12:44:09,412 It would look like\nbinary search is back. 13166 12:44:09,411 --> 12:44:15,081 And we can now do things in logarithmic\n 13167 12:44:15,082 --> 12:44:17,421 Is this a binary search tree? 13168 12:44:19,141 --> 12:44:21,862 And again, a binary\nsearch tree is a tree 13169 12:44:21,862 --> 12:44:28,599 where the root is greater than its left\n 13170 12:44:29,391 --> 12:44:30,862 So you're nodding your head. 13171 12:44:33,502 --> 12:44:35,512 So this is a binary search tree. 13172 12:44:35,512 --> 12:44:37,872 Is this a binary search tree? 13173 12:44:40,341 --> 12:44:43,191 Or I'm hearing just my delay\nchanging the vote it would seem. 13174 12:44:43,192 --> 12:44:45,562 So this is one of those trick questions. 
13175 12:44:45,561 --> 12:44:47,961 This is a binary search\ntree because I've not 13176 12:44:47,961 --> 12:44:50,872 violated the definition\nof what I gave you, right. 13177 12:44:50,872 --> 12:44:56,961 Is there any example of a left child\n 13178 12:44:56,961 --> 12:44:59,961 Or is there any example of a right\n 13179 12:44:59,961 --> 12:45:02,379 That's just the opposite way\nof describing the same thing. 13180 12:45:02,379 --> 12:45:04,552 No, this is a binary search tree. 13181 12:45:04,552 --> 12:45:07,692 Unfortunately, it also looks like,\n 13182 12:45:09,381 --> 12:45:11,451 But you could imagine\nthis happening, right. 13183 12:45:11,451 --> 12:45:14,121 Suppose that I hadn't been as\nthoughtful as I was earlier 13184 12:45:14,122 --> 12:45:17,452 by inserting 2, and then 1, and then 3. 13185 12:45:17,451 --> 12:45:19,641 Which nicely balanced everything out. 13186 12:45:19,641 --> 12:45:22,341 Suppose that instead, because\nof what the user is typing in 13187 12:45:22,341 --> 12:45:25,461 or whatever you contrive in your\n 13188 12:45:27,741 --> 12:45:30,331 Like, you've created a\nproblem for yourself. 13189 12:45:30,332 --> 12:45:33,772 Because if we follow the same logic\n 13190 12:45:33,771 --> 12:45:38,511 this is how you might implement\n 13191 12:45:38,512 --> 12:45:42,232 if you just blindly keep\nfollowing that definition. 13192 12:45:42,232 --> 12:45:44,512 I mean, this would be\nbetter designed as what? 13193 12:45:44,512 --> 12:45:46,972 If we rotated the whole thing around. 13194 12:45:48,351 --> 12:45:50,542 And those kinds of trees\nactually have names. 13195 12:45:50,542 --> 12:45:52,881 There are trees called AVL\ntrees in computer science. 13196 12:45:52,881 --> 12:45:54,531 There are red-black\ntrees in computer science. 
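To make the balance problem concrete, here is a sketch of an insert that blindly follows the binary search tree definition, plus a helper that measures the resulting height. Neither function is shown in the lecture; insert and height are hypothetical names, and this insert simply ignores duplicates and leaves the tree unchanged if malloc fails.

```c
#include <stdlib.h>

// Assumed node layout for a binary search tree of ints.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Naive insert: walk left or right by comparison and attach a new leaf.
// No rebalancing is done. Returns the root of the (possibly new) subtree.
node *insert(node *root, int number)
{
    if (root == NULL)
    {
        node *n = malloc(sizeof(node));
        if (n == NULL)
        {
            return NULL; // allocation failed: the subtree stays empty
        }
        n->number = number;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (number < root->number)
    {
        root->left = insert(root->left, number);
    }
    else if (number > root->number)
    {
        root->right = insert(root->right, number);
    }
    return root; // duplicates fall through and are ignored
}

// Number of nodes on the longest path from the root down to a leaf.
int height(const node *root)
{
    if (root == NULL)
    {
        return 0;
    }
    int l = height(root->left);
    int r = height(root->right);
    return 1 + (l > r ? l : r);
}
```

Inserting 2, then 1, then 3 gives a tree of height 2, while inserting 1, then 2, then 3 gives a right-leaning chain of height 3 in which no left child is ever used. With n values inserted in sorted order the height grows to n, which is why search on an unbalanced tree can take a linear number of steps.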
13197 12:45:54,531 --> 12:45:56,781 There are other types of\ntrees that, additionally 13198 12:45:56,781 --> 12:45:59,991 add some logic that tell you\nwhen you've got to pivot the thing 13199 12:45:59,991 --> 12:46:03,720 and rotate it, and snip off the\n 13200 12:46:03,720 --> 12:46:05,512 But a binary search\ntree, in and of itself 13201 12:46:05,512 --> 12:46:09,152 does not guarantee that it\nwill be balanced, so to speak. 13202 12:46:09,152 --> 12:46:11,722 And so, if you consider\nthe worst case scenario 13203 12:46:11,722 --> 12:46:13,342 of even using a binary search tree. 13204 12:46:13,341 --> 12:46:15,441 If you're not smart about\nthe code you're writing 13205 12:46:15,442 --> 12:46:17,662 and you just blindly\nfollow this definition 13206 12:46:17,661 --> 12:46:21,771 you might accidentally create a\n 13207 12:46:21,771 --> 12:46:24,531 tree that essentially\nlooks like a linked list. 13208 12:46:24,531 --> 12:46:26,991 Because you're not even using\nany of the left children. 13209 12:46:26,991 --> 12:46:30,231 So unfortunately, the literal\nanswer to the question 13210 12:46:30,232 --> 12:46:32,962 here is what's the\nrunning time of search? 13211 12:46:34,881 --> 12:46:37,461 But not if you don't maintain\nthe balance of the tree. 13212 12:46:37,461 --> 12:46:42,771 Both insert and search could actually\n 13213 12:46:44,434 --> 12:46:46,641 If you don't somehow take\nthat into account, and we're not 13214 12:46:46,641 --> 12:46:48,201 going to do the code for that here. 13215 12:46:48,201 --> 12:46:51,621 It's a higher level thing you\nmight explore down the road. 13216 12:46:51,622 --> 12:46:55,412 It can devolve into something\nthat you might not have intended. 13217 12:46:55,411 --> 12:46:57,503 And so, now that we're\ntalking about 2 dimensions 13218 12:46:57,504 --> 12:46:59,211 really the onus\nis on the programmer 13219 12:46:59,211 --> 12:47:01,972 to consider what kinds of\nperverse situations might happen. 
13220 12:47:01,972 --> 12:47:04,342 Where the thing devolves\ninto a structure 13221 12:47:04,341 --> 12:47:07,831 that you don't actually\nwant it to devolve into. 13222 12:47:08,332 --> 12:47:09,842 We've got just a few structures to go. 13223 12:47:09,841 --> 12:47:11,421 Let's go ahead and take one\nmore 5 minute break here. 13224 12:47:11,421 --> 12:47:12,891 When we come back,\nwe'll talk at this level 13225 12:47:12,891 --> 12:47:14,512 about some final applications of this. 13226 12:47:19,341 --> 12:47:22,731 And as promised, we'll operate\nnow at this higher level. 13227 12:47:22,732 --> 12:47:26,002 Where if we take for granted that, even\n 13228 12:47:26,002 --> 12:47:28,794 to play with these techniques yet,\n 13229 12:47:30,262 --> 12:47:33,112 Both in a one dimension\nand even 2 dimensions 13230 12:47:33,112 --> 12:47:35,451 to build things like lists and trees. 13231 12:47:35,451 --> 12:47:37,461 So if we have these building blocks. 13232 12:47:37,461 --> 12:47:40,161 Things like now arrays,\nand lists, and trees 13233 12:47:40,161 --> 12:47:44,271 what if we start to amalgamate\n 13234 12:47:44,271 --> 12:47:46,381 of multiple data structures? 13235 12:47:46,381 --> 12:47:49,841 Can we start to get some of the best\n 13236 12:47:49,841 --> 12:47:51,191 something called a hash table. 13237 12:47:51,192 --> 12:47:55,022 So a hash table is a Swiss\narmy knife of data structures 13238 12:47:55,021 --> 12:47:56,791 in that it's so commonly used. 13239 12:47:56,792 --> 12:48:01,482 Because it allows you to associate\nkeys with value, so to speak. 13240 12:48:01,482 --> 12:48:06,542 So, for instance, it allows you to\n 13241 12:48:08,552 --> 12:48:11,402 Or anything where you have\nto take something as input 13242 12:48:11,402 --> 12:48:13,781 and get as output a corresponding\npiece of information. 13243 12:48:13,781 --> 12:48:16,692 A hash table is often a\ndata structure of choice. 13244 12:48:16,692 --> 12:48:17,942 And here's what it looks like. 
13245 12:48:17,942 --> 12:48:20,281 It actually looks like\nan array, at first glance. 13246 12:48:20,281 --> 12:48:23,472 But for discussion's sake, I've\ndrawn this array vertically 13247 12:48:26,141 --> 12:48:31,201 But it allows you, a hash table, to\n 13248 12:48:32,222 --> 12:48:35,612 So, for instance, there's actually\n26 locations in this array. 13249 12:48:35,612 --> 12:48:38,582 Because I want to, for\ninstance, store initially 13250 12:48:38,582 --> 12:48:41,461 names of people, for instance. 13251 12:48:41,461 --> 12:48:44,135 And wouldn't it be nice if the\nperson's name starts with A 13252 12:48:44,135 --> 12:48:45,302 I have a go to place for it. 13253 12:48:46,262 --> 12:48:48,345 And if it starts with Z,\nI put them at the bottom. 13254 12:48:48,345 --> 12:48:50,552 So that I can jump\ninstantly, arithmetically 13255 12:48:50,552 --> 12:48:52,952 using a little bit of\nASCII or Unicode fanciness 13256 12:48:52,951 --> 12:48:56,021 exactly to the location that\nthey need to go. 13257 12:48:56,021 --> 12:48:58,171 So, for instance, here's\nour array, 0-indexed. 13258 12:48:59,612 --> 12:49:01,982 If I think of this,\nthough, as A through Z 13259 12:49:01,982 --> 12:49:03,852 I'm going to think of\nthese 26 locations 13260 12:49:03,851 --> 12:49:07,112 now in the context of a hash table,\n 13261 12:49:07,112 --> 12:49:09,491 So buckets into which\nyou can put values. 13262 12:49:09,491 --> 12:49:13,862 So, for instance, suppose that we\n 13263 12:49:15,072 --> 12:49:16,741 And that name is say, Albus. 13264 12:49:16,741 --> 12:49:21,461 So Albus starting with A. Albus might\n 13265 12:49:21,961 --> 12:49:23,669 And then, we want to\ninsert another name. 13266 12:49:23,669 --> 12:49:25,112 This one happens to be Zacharias. 13267 12:49:25,112 --> 12:49:28,171 Starting with Z, so it goes all\nthe way at the end of this data 13268 12:49:28,171 --> 12:49:29,972 structure in location 25 a.k.a. 
13269 12:49:30,872 --> 12:49:34,742 And then, maybe a third name like\n 13270 12:49:34,741 --> 12:49:36,792 according to that\nposition in the alphabet. 13271 12:49:36,792 --> 12:49:39,542 So this is great because\nin constant time 13272 12:49:39,542 --> 12:49:43,502 I can insert and conversely\nsearch for any of these names 13273 12:49:43,502 --> 12:49:45,182 based on the first letter of their name. 13274 12:49:45,182 --> 12:49:47,580 A, or Z, or H, in this case. 13275 12:49:47,580 --> 12:49:50,372 Let's fast forward and assume we\n 13276 12:49:50,372 --> 12:49:52,382 might look familiar,\ninto this hash table. 13277 12:49:52,381 --> 12:49:56,591 It's great because every\nname has its own location. 13278 12:49:56,591 --> 12:50:00,961 But if you're thinking of names\n 13279 12:50:00,961 --> 12:50:03,192 we eventually encounter a\nproblem with this, right. 13280 12:50:03,192 --> 12:50:06,961 When could something go wrong\nusing a hash table like this 13281 12:50:06,961 --> 12:50:09,572 if we wanted to insert even more names? 13282 12:50:09,572 --> 12:50:11,771 What's going to eventually happen? 13283 12:50:12,271 --> 12:50:14,479 There's already someone with\nthe first letter, right. 13284 12:50:14,480 --> 12:50:17,342 Like I haven't even mentioned\nHarry, for instance, or Hagrid. 13285 12:50:17,341 --> 12:50:19,231 And yet, Hermione's\nalready using that spot. 13286 12:50:19,232 --> 12:50:21,512 So that invites the\nquestion, well, what happens? 13287 12:50:21,512 --> 12:50:25,082 Maybe, if we want to insert Harry\n 13288 12:50:26,192 --> 12:50:28,805 But then if there's a location\nI, where do we put them? 13289 12:50:28,805 --> 12:50:31,472 And it just feels like the situation\ncould very quickly devolve. 13290 12:50:31,472 --> 12:50:34,412 But I've deliberately\ndrawn this data structure 13291 12:50:34,411 --> 12:50:37,471 that I claim as a hash\ntable, in 2 directions. 
13292 12:50:39,601 --> 12:50:42,781 But what might this be hinting\nI'm using horizontally 13293 12:50:42,781 --> 12:50:45,781 even though I'm drawing the rectangles\n 13294 12:50:47,239 --> 12:50:48,572 Maybe another array, to be fair. 13295 12:50:48,572 --> 12:50:51,739 But, honestly, arrays are such a pain\n 13296 12:50:52,292 --> 12:50:56,082 These look like the beginnings\nof a linked list, if you will. 13297 12:50:56,082 --> 12:50:59,671 Where the name is where the number\n 13298 12:50:59,671 --> 12:51:01,682 horizontally now just\nfor discussion's sake. 13299 12:51:01,682 --> 12:51:05,281 And this seems to be a pointer\nthat isn't pointing anywhere yet. 13300 12:51:05,281 --> 12:51:10,561 But it looks like the array is 26\n 13301 12:51:11,402 --> 12:51:14,156 Some of which are pointing at\nthe first node in a linked list. 13302 12:51:14,156 --> 12:51:16,531 So that's really what a hash\ntable might be in your mind. 13303 12:51:16,531 --> 12:51:21,309 An amalgam of an array, whose\nelements are linked lists. 13304 12:51:21,309 --> 12:51:23,851 And in theory, this gives you\nthe best of both worlds, right. 13305 12:51:23,851 --> 12:51:26,911 You get random access with\nhigh probability, right. 13306 12:51:26,911 --> 12:51:30,101 You get to jump immediately to the\n 13307 12:51:30,101 --> 12:51:32,911 But, if you run into this perverse\n 13308 12:51:34,351 --> 12:51:37,832 It starts to devolve into a\nlinked list, but it's at least 26 13309 12:51:39,061 --> 12:51:42,151 Not one massive linked list,\nwhich would be Big O of n. 13310 12:51:43,961 --> 12:51:46,112 So if Harry gets inserted, and Hagrid, 13311 12:51:46,112 --> 12:51:50,262 yeah, you have to chain them\ntogether, so to speak, in this way. 13312 12:51:50,262 --> 12:51:53,127 But, at least you've not\npainted yourself into a corner. 13313 12:51:53,127 --> 12:51:56,252 And in fact, if we fast forward and\n 13314 12:51:56,252 --> 12:51:58,601 the data structure\nstarts to look like this. 
13315 12:51:58,601 --> 12:52:00,941 So the chains are not terribly long. 13316 12:52:00,942 --> 12:52:03,752 And some of them are actually\nof size 0 because there's just 13317 12:52:03,752 --> 12:52:06,631 some unpopular letters of the\nalphabet among these names. 13318 12:52:06,631 --> 12:52:08,581 But it seems better than\njust putting everyone 13319 12:52:08,582 --> 12:52:11,342 in one big array, or\none big linked list. 13320 12:52:11,341 --> 12:52:15,671 We're trying to balance these trade\n 13321 12:52:15,671 --> 12:52:17,891 Well, how might we represent\nsomething like this? 13322 12:52:17,891 --> 12:52:19,622 Here's how we could describe this thing. 13323 12:52:19,622 --> 12:52:22,802 A node in the context of a\nlinked list could be this. 13324 12:52:22,802 --> 12:52:26,342 I have an array called\nword of type char. 13325 12:52:26,341 --> 12:52:30,541 And it's big enough to fit the\n 13326 12:52:30,542 --> 12:52:32,372 And the plus 1 why, probably? 13327 12:52:33,241 --> 12:52:34,211 SPEAKER 1: The null character. 13328 12:52:34,211 --> 12:52:37,322 So I'm assuming that longest word\n 13329 12:52:37,951 --> 12:52:40,216 And it's something big\nlike 40, 100, whatever. 13330 12:52:40,216 --> 12:52:43,291 Whatever the longest word\nin the Harry Potter universe 13331 12:52:43,292 --> 12:52:45,921 is or the English dictionary is. 13332 12:52:45,921 --> 12:52:51,531 Longest word plus 1 should be sufficient\n 13333 12:52:51,531 --> 12:52:53,841 And then, what else does\neach of these nodes have? 13334 12:52:53,841 --> 12:52:57,541 Well it has a pointer to another node. 13335 12:52:57,542 --> 12:52:59,872 So here's how we might\nimplement the notion of a node 13336 12:52:59,872 --> 12:53:04,192 in the context of storing\nnot integers, but names. 13337 12:53:05,841 --> 12:53:08,841 But how do we decide what\nthe hash table itself is? 
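The node just described, a char array with room for the longest word plus a terminating null character and a pointer to the next node, might be declared as follows. The value 45 for LONGEST_WORD is an arbitrary placeholder rather than a figure from the lecture, and make_node is a hypothetical helper added to show how such a node would be filled in.

```c
#include <stdlib.h>
#include <string.h>

// Assumed maximum word length; the lecture only says "something big".
#define LONGEST_WORD 45

// A linked-list node that stores a name instead of an integer.
typedef struct node
{
    char word[LONGEST_WORD + 1]; // +1 leaves room for the terminating '\0'
    struct node *next;
} node;

// Hypothetical helper: allocate a node holding a copy of `word`.
// Returns NULL if the word is too long or malloc fails.
node *make_node(const char *word)
{
    if (strlen(word) > LONGEST_WORD)
    {
        return NULL; // would not fit in the fixed-size array
    }
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return NULL;
    }
    strcpy(n->word, word); // safe: length was checked above
    n->next = NULL;
    return n;
}
```

Storing the characters inside the node, rather than a char pointer, means each node owns its own copy of the name.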
13338 12:53:08,841 --> 12:53:12,621 Well, if we now have a definition of a\n 13339 12:53:12,622 --> 12:53:14,992 or even globally, called hash table. 13340 12:53:14,991 --> 12:53:20,391 That itself is an array\nof node* pointers. 13341 12:53:20,391 --> 12:53:22,792 That is an array of pointers to nodes. 13342 12:53:22,792 --> 12:53:24,772 The beginnings of linked lists. 13343 12:53:26,432 --> 12:53:28,565 I proposed, verbally, that it be 26. 13344 12:53:28,565 --> 12:53:30,982 But honestly, if you get a lot\nof collisions, so to speak. 13345 12:53:30,982 --> 12:53:33,105 A lot of H names trying\nto go to the same place. 13346 12:53:33,105 --> 12:53:35,272 Well, maybe, we need to be\nsmarter and not just look 13347 12:53:35,271 --> 12:53:36,688 at the first letter of their name. 13348 12:53:36,688 --> 12:53:38,281 But, maybe, the first and the second. 13349 12:53:38,281 --> 12:53:42,381 So it's H-A and H-E. But wait, no,\n 13350 12:53:42,381 --> 12:53:45,322 But we start to at least make\nthe problem a little less 13351 12:53:45,322 --> 12:53:48,982 impactful by tinkering with\nsomething like the number of buckets 13352 12:53:50,362 --> 12:53:55,042 But how do we decide where someone\n 13353 12:53:55,042 --> 12:53:57,381 Well, it's an old school\nproblem of input and output. 13354 12:53:57,381 --> 12:54:00,741 The input to the problem is going\nto be something like the name. 13355 12:54:00,741 --> 12:54:02,781 And the algorithm in\nthe middle, as of today 13356 12:54:02,781 --> 12:54:05,211 is going to be something\ncalled a hash function. 13357 12:54:05,211 --> 12:54:07,101 A hash function is\ngenerally something that 13358 12:54:07,101 --> 12:54:10,851 takes as input, a string, a\nnumber, whatever, and produces 13359 12:54:10,851 --> 12:54:13,341 as output a location in our context. 
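Putting the pieces together, a hash table with 26 buckets, a first-letter hash function, and chaining might be sketched like this. The names BUCKETS, insert, and lookup, the value of LONGEST_WORD, and the choice to send anything that does not start with a letter to bucket 0 are all assumptions made for illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LONGEST_WORD 45 // assumed maximum name length
#define BUCKETS 26      // one bucket per letter, A through Z

typedef struct node
{
    char word[LONGEST_WORD + 1];
    struct node *next;
} node;

// An array of pointers to nodes: each element is the start of a linked list.
// As a global, every element starts out as NULL.
node *hash_table[BUCKETS];

// Map a word to a bucket using only its first letter: A -> 0, ..., Z -> 25.
unsigned int hash(const char *word)
{
    int c = toupper((unsigned char) word[0]);
    if (c < 'A' || c > 'Z')
    {
        return 0; // assumption: non-letters and empty strings share bucket 0
    }
    return (unsigned int) (c - 'A');
}

// Insert a word at the front of its bucket's chain.
bool insert(const char *word)
{
    if (strlen(word) > LONGEST_WORD)
    {
        return false;
    }
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    strcpy(n->word, word);
    unsigned int i = hash(word);
    n->next = hash_table[i]; // new node points at the old head (or NULL)
    hash_table[i] = n;       // bucket now starts at the new node
    return true;
}

// Jump to the word's bucket, then walk that one chain.
bool lookup(const char *word)
{
    for (node *p = hash_table[hash(word)]; p != NULL; p = p->next)
    {
        if (strcmp(p->word, word) == 0)
        {
            return true;
        }
    }
    return false;
}
```

Prepending keeps insertion at a constant number of steps regardless of chain length; lookup cost depends on how many names collide in the same bucket.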
13360 12:54:16,972 --> 12:54:19,671 Or whatever the number\nof buckets you want is 13361 12:54:19,671 --> 12:54:23,851 it's going to just tell you where to\n 13362 12:54:23,851 --> 12:54:27,682 So, for instance, Albus, according to\n 13363 12:54:30,052 --> 12:54:32,782 So the hash function, in the\nmiddle of that black box 13364 12:54:32,781 --> 12:54:35,241 is pretty simplistic in this story. 13365 12:54:35,241 --> 12:54:38,841 It's just looking at the ASCII\n 13366 12:54:39,591 --> 12:54:42,631 And then, subtracting\noff what capital A is, 65. 13367 12:54:42,631 --> 12:54:46,951 So like doing some math to get\nback a number between 0 and 25. 13368 12:54:46,951 --> 12:54:50,091 So that's how we got to\nthis point in the story. 13369 12:54:50,091 --> 12:54:54,921 And how might we, then, resolve\nthe problem further and use 13370 12:54:54,921 --> 12:54:56,542 this notion of hashing more generally? 13371 12:54:56,542 --> 12:54:58,417 Well just for demonstration's\nsake here, here's 13372 12:54:58,417 --> 12:55:00,772 actually some buckets, literally. 13373 12:55:00,771 --> 12:55:03,861 And we've labeled, in advance,\nthese buckets with the suits 13374 12:55:07,252 --> 12:55:12,082 And we've got diamonds here. 13375 12:55:12,082 --> 12:55:15,592 And we've got, what else here? 13376 12:55:19,372 --> 12:55:22,074 So we have a deck of cards\nhere, for instance, right. 13377 12:55:22,074 --> 12:55:24,531 And this is something you,\nyourself, might do instinctively 13378 12:55:24,531 --> 12:55:26,902 if you're getting ready to\nstart playing a game of cards. 13379 12:55:26,902 --> 12:55:29,069 You're just cleaning up or\nyou want things in order. 13380 12:55:29,069 --> 12:55:31,444 Like, here is literally\na jumbo deck of cards. 13381 12:55:31,444 --> 12:55:33,862 What would be the easiest way\nfor me to sort these things? 
13382 12:55:33,862 --> 12:55:36,569 Well we've got a whole bunch of\n 13383 12:55:36,569 --> 12:55:39,112 So I could go through like,\nhere's the 3 of diamonds. 13384 12:55:39,112 --> 12:55:41,362 And I could, here let me\nthrow this up on the screen. 13385 12:55:41,362 --> 12:55:43,052 Just so, if you're far in back. 13386 12:55:47,991 --> 12:55:49,612 I could do this in order here. 13387 12:55:49,612 --> 12:55:52,022 But a lot of us, honestly,\nif given a deck of cards. 13388 12:55:52,021 --> 12:55:54,771 And you just want to clean\nit up and sort it in order 13389 12:55:54,771 --> 12:55:56,101 you might do things like this. 13390 12:55:56,101 --> 12:55:59,512 Well here's my input, 3 of diamonds,\n 13391 12:56:03,122 --> 12:56:06,982 And if you keep going through the cards,\n 13392 12:56:10,552 --> 12:56:12,502 And it's still going\nto take you 52 steps. 13393 12:56:12,502 --> 12:56:15,502 But at the end of it, you\nhave hashed all of the cards 13394 12:56:17,091 --> 12:56:19,971 And now you have problems\nof size 13, which 13395 12:56:19,972 --> 12:56:23,512 is a little more tenable than\ndoing one massive 52 card problem. 13396 12:56:23,512 --> 12:56:25,552 You can now do 4, 13 size problems. 13397 12:56:25,552 --> 12:56:29,272 And so hashing is something that even\n 13398 12:56:29,271 --> 12:56:34,161 Taking as input some card, some name,\n 13399 12:56:34,161 --> 12:56:39,441 A temporary pile in which you\nwant to stage things, so to speak. 13400 12:56:39,442 --> 12:56:41,924 But these collisions are inevitable. 13401 12:56:41,923 --> 12:56:44,631 And honestly, if we kept going\n 13402 12:56:44,631 --> 12:56:47,432 some of these chains would get\nlonger, and longer and longer. 13403 12:56:47,432 --> 12:56:50,811 Which means that instead of\ngetting someone's name quickly 13404 12:56:50,811 --> 12:56:53,659 by searching for them\nor inserting them, might 13405 12:56:53,660 --> 12:56:55,202 start taking a decent amount of time. 
13406 12:56:55,201 --> 12:56:58,252 So what could we do instead to\nresolve situations like this? 13407 12:56:58,252 --> 12:57:01,851 If the problem, fundamentally, is\n 13408 12:57:01,851 --> 12:57:04,868 popular, H, we need\nto take in more input. 13409 12:57:04,868 --> 12:57:07,201 Not just the first letter but\nmaybe the first 2 letters. 13410 12:57:07,201 --> 12:57:10,252 So if we do that, we\ncan go from A through Z 13411 12:57:10,252 --> 12:57:16,682 to something more extreme like maybe\n 13412 12:57:16,682 --> 12:57:20,152 So that now Harry and Hermione\nend up at different locations. 13413 12:57:20,152 --> 12:57:23,072 But, darn it, Hagrid\nstill collides with Harry. 13414 12:57:24,862 --> 12:57:27,031 The chains aren't quite as long. 13415 12:57:27,031 --> 12:57:28,891 But the problem isn't\nfundamentally gone. 13416 12:57:28,891 --> 12:57:32,122 And in this case here, anyone\nknow how many buckets we just 13417 12:57:32,122 --> 12:57:40,312 increased to, if we now look at not just\n 13418 12:57:42,921 --> 12:57:46,461 So the easy answer is\n26 squared, or 676. 13419 12:57:46,461 --> 12:57:48,051 So that's a lot more buckets. 13420 12:57:48,052 --> 12:57:50,522 And this is why I only showed\na few of them on the screen. 13421 12:57:51,411 --> 12:57:54,531 And it spreads things out in particular. 13422 12:57:54,531 --> 12:57:56,121 What if we take this one step further? 13423 12:57:56,122 --> 12:58:01,612 Instead of H-A, we do like H-A-A,\n 13424 12:58:01,612 --> 12:58:03,561 Well now, we have an\neven better situation. 13425 12:58:03,561 --> 12:58:05,961 Because Hermione has her own spot. 13426 12:58:09,322 --> 12:58:11,362 But there's a trade-off here. 13427 12:58:11,362 --> 12:58:14,722 The upside is now, arithmetically,\nwe can find their locations 13428 12:58:17,512 --> 12:58:21,422 But 3 is constant, no matter how many\n 13429 12:58:21,421 --> 12:58:24,633 But what's the downside here? 
13430 12:58:27,771 --> 12:58:33,322 We're now up to 17,576 buckets, which\n 13431 12:58:33,322 --> 12:58:35,222 Computers have a lot\nof memory these days. 13432 12:58:35,222 --> 12:58:38,932 But as you can infer,\nI can't really think 13433 12:58:38,932 --> 12:58:43,641 of someone whose name started with\n 13434 12:58:44,313 --> 12:58:46,521 And if we keep going,\ndefinitely don't know of anyone 13435 12:58:46,521 --> 12:58:49,521 whose name started with\nZ-Z-Z or A-A-A. There's 13436 12:58:49,521 --> 12:58:54,871 a lot of not useful combinations\n 13437 12:58:54,872 --> 12:58:58,522 so that you can do a bit of math\n 13438 12:58:59,773 --> 12:59:01,231 But they're just going to be empty. 13439 12:59:01,232 --> 12:59:04,862 So it's a very sparsely\npopulated array, so to speak. 13440 12:59:04,862 --> 12:59:08,122 So what does that really mean\nfor performance, ultimately? 13441 12:59:08,122 --> 12:59:10,882 Well let's consider, again, in\n 13442 12:59:10,881 --> 12:59:14,271 It turns out that a hash\ntable, technically speaking 13443 12:59:14,271 --> 12:59:18,351 is still just going to give us\nBig O of n in the worst case. 13444 12:59:18,951 --> 12:59:21,921 If you have some crazy perverse\n 13445 12:59:21,921 --> 12:59:25,432 has a name that starts with A, or\n 13446 12:59:25,432 --> 12:59:26,722 you just get really unlucky. 13447 12:59:26,722 --> 12:59:28,599 And your chain is massively long. 13448 12:59:28,599 --> 12:59:30,682 Well then, at that point,\nit's just a linked list. 
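The bucket arithmetic behind that sparseness is easy to check. The sketch below is illustrative only: buckets_for and used_buckets are hypothetical helper names, and the three names are the ones from the running example.

```python
# How many buckets does hashing on the first k letters need,
# and how many of them do three names actually use?
names = ["Harry", "Hermione", "Hagrid"]


def buckets_for(prefix_len):
    """One bucket per possible combination of prefix_len letters A-Z."""
    return 26 ** prefix_len


def used_buckets(names, prefix_len):
    """Count the distinct prefixes, i.e. the buckets that are not empty."""
    return len({name[:prefix_len].upper() for name in names})


# Maps prefix length -> (total buckets, buckets actually occupied).
summary = {k: (buckets_for(k), used_buckets(names, k)) for k in (1, 2, 3)}
```

With 3 letters each name finally gets its own bucket, but only 3 of the 17,576 buckets hold anything. The table gets sparser with every extra letter.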
13449 12:59:31,599 --> 12:59:33,862 It's like the perverse\nsituation with the tree, where 13450 12:59:33,862 --> 12:59:39,682 if you insert it without any mind for\n 13451 12:59:39,682 --> 12:59:43,881 But there's a difference here\nbetween a theoretical performance 13452 12:59:45,502 --> 12:59:48,771 If you look back at the\nthe hash table here 13453 12:59:48,771 --> 12:59:55,371 this is absolutely, in practice, going\n 13454 12:59:55,372 --> 12:59:58,342 Mathematically, asymptotically,\nbig O notation, sure. 13455 13:00:00,112 --> 13:00:03,982 But if what we're really caring about\n 13456 13:00:03,982 --> 13:00:06,472 there's something to be said\nfor crafting a data structure. 13457 13:00:06,472 --> 13:00:09,052 That technically, if this data\nwere uniformly distributed 13458 13:00:09,052 --> 13:00:12,932 is 26 times faster than\na linked list alone. 13459 13:00:12,932 --> 13:00:18,201 And so, there's this tension too\nbetween systems, types of CS 13460 13:00:19,328 --> 13:00:21,411 Where yeah, theoretically,\nthese are all the same. 13461 13:00:21,411 --> 13:00:24,141 But in practice, for\nmaking real-world software 13462 13:00:24,141 --> 13:00:29,872 improving this speed by a factor of 26\n 13463 13:00:29,872 --> 13:00:31,652 might actually make a big difference. 13464 13:00:31,652 --> 13:00:33,152 But there's going to be a trade off. 13465 13:00:33,152 --> 13:00:37,022 And that's typically some other\n 13466 13:00:37,521 --> 13:00:40,581 How about another data\nstructure we could build. 13467 13:00:40,582 --> 13:00:43,492 Let me fast forward to\nsomething here called a trie. 13468 13:00:43,491 --> 13:00:46,402 So a trie, a weird\nname in pronunciation. 13469 13:00:46,402 --> 13:00:49,432 Short for retrieval,\npronounced trie typically. 13470 13:00:49,432 --> 13:00:55,162 A trie is a tree that actually\ngives us constant time lookup 13471 13:00:59,572 --> 13:01:04,711 In the world of a trie, you\ncreate a tree out of arrays. 
13472 13:01:04,711 --> 13:01:07,042 So we're really getting into\nthe Frankenstein territory 13473 13:01:07,042 --> 13:01:09,802 of just building things up with\nspare parts of data structures 13474 13:01:10,982 --> 13:01:13,942 But the root of a trie\nis, itself, an array. 13475 13:01:16,012 --> 13:01:22,281 Where each element in that\ntrie points to another node 13476 13:01:22,281 --> 13:01:23,991 which is to say another array. 13477 13:01:23,991 --> 13:01:26,961 And each of those locations in\nthe array represents a letter 13478 13:01:26,961 --> 13:01:28,402 of the alphabet like A through Z. 13479 13:01:28,402 --> 13:01:32,452 So for instance, if you wanted to store\n 13480 13:01:32,451 --> 13:01:36,531 not in a hash table, not in a linked\n 13481 13:01:36,531 --> 13:01:41,301 What you would do is hash on every\n 13482 13:01:42,122 --> 13:01:45,532 So a trie is like a multi-tier\nhash table, in a sense. 13483 13:01:45,531 --> 13:01:47,252 Where you first look\nat the first letter 13484 13:01:47,252 --> 13:01:49,959 then the second letter, then the\n 13485 13:01:49,959 --> 13:01:53,421 For instance, each of these\nlocations represents a letter A 13486 13:01:53,421 --> 13:01:56,932 through Z. Suppose I wanted to\ninsert someone's name into this 13487 13:01:56,932 --> 13:02:01,012 that starts with the letter\nH, like Hagrid for instance. 13488 13:02:01,012 --> 13:02:03,842 Well, I go to the location\nH. I see it's null 13489 13:02:03,841 --> 13:02:06,921 which means I need to malloc myself\n 13490 13:02:08,451 --> 13:02:12,291 Then, suppose I want to store the\n 13491 13:02:12,292 --> 13:02:14,914 an A. So I go to that\nlocation in the second node. 13492 13:02:14,913 --> 13:02:16,371 And I see, OK, it's currently null. 13493 13:02:17,413 --> 13:02:19,921 So I allocate another node\nusing malloc or the like. 13494 13:02:19,921 --> 13:02:24,171 And now I have H-A-G. And\nI continue this with R-I-D. 
13495 13:02:24,171 --> 13:02:27,722 And then, when I get to the\nbottom of this person's name 13496 13:02:27,722 --> 13:02:30,322 I just have to indicate\nhere in color, but probably 13497 13:02:30,322 --> 13:02:31,762 with a Boolean value or something. 13498 13:02:31,762 --> 13:02:35,672 Like a true value that\nsays, a name stops here. 13499 13:02:35,671 --> 13:02:41,222 So that it's clear that the person's\n 13500 13:02:41,222 --> 13:02:45,752 or H-A-G-R-I. It's H-A-G-R-I-D.\nAnd the D is green 13501 13:02:45,752 --> 13:02:49,082 just to indicate there's like some\n 13502 13:02:49,082 --> 13:02:52,781 This is the node in\nwhich the name stops. 13503 13:02:52,781 --> 13:02:57,722 And if I continue this logic, here's\n 13504 13:02:57,722 --> 13:03:00,902 And here's how I might\ninsert someone like Hermione. 13505 13:03:00,902 --> 13:03:05,491 And what's interesting about the\n 13506 13:03:07,411 --> 13:03:10,471 Which starts to get compelling\nbecause you're reusing space. 13507 13:03:10,472 --> 13:03:15,391 You're using the same nodes\nfor names like H-A-G and H-A-R 13508 13:03:15,391 --> 13:03:17,851 because they share H and an A in common. 13509 13:03:17,851 --> 13:03:20,112 And they all share an H in common. 13510 13:03:20,112 --> 13:03:23,822 So you have this data structure\nnow that, itself, is a tree. 13511 13:03:23,822 --> 13:03:27,572 Each node in the tree\nis, itself, an array. 13512 13:03:27,572 --> 13:03:31,171 And we, therefore, might implement\n 13513 13:03:31,171 --> 13:03:36,676 Every node is containing, I'll\ndo it in reverse order, an array. 13514 13:03:36,677 --> 13:03:39,302 I'll call it children because\nthat's what it really represents. 13515 13:03:39,302 --> 13:03:41,612 Up to 26 children for\neach of these nodes. 13516 13:03:42,911 --> 13:03:45,841 So I might have used just\na constant for number 26 13517 13:03:45,841 --> 13:03:47,881 to give myself 26\nletters of the alphabet. 
13518 13:03:47,881 --> 13:03:52,112 And each of those arrays\nstores that many node stars. 13519 13:03:52,112 --> 13:03:54,031 That many pointers to another node. 13520 13:03:54,031 --> 13:03:55,502 And here's an example of the bool. 13521 13:03:55,502 --> 13:03:58,232 This is what I represented in\ngreen on the slide a moment ago. 13522 13:03:58,232 --> 13:04:00,062 I also need another piece of data. 13523 13:04:00,061 --> 13:04:03,002 Just a 0 or 1, a true\nor false, that says yes. 13524 13:04:03,002 --> 13:04:08,292 A name stops in this node or it's just\n 13525 13:04:08,292 --> 13:04:12,572 But the upside of this is\nthat the height of this tree 13526 13:04:12,572 --> 13:04:15,572 is only as tall as the\nperson's longest name. 13527 13:04:15,572 --> 13:04:22,411 H-A-G-R-I-D or H-E-R-M-I-O-N-E. And\n 13528 13:04:22,411 --> 13:04:26,221 people are in this data structure,\nthere's 3 at the moment 13529 13:04:26,222 --> 13:04:30,632 if there were 3 million, it would\n 13530 13:04:31,982 --> 13:04:37,232 H-E-R-M-I-O-N-E. So, 8 steps total. 13531 13:04:37,232 --> 13:04:42,062 No matter if there's 2 other people,\n 13532 13:04:42,061 --> 13:04:46,141 Because her name\nis always on the same path. 13533 13:04:46,141 --> 13:04:51,031 And if you assume that there's a\n 13534 13:04:51,902 --> 13:04:53,991 Maybe it's 40, 100, whatever. 13535 13:04:53,991 --> 13:04:55,741 Whatever the longest\nname in the world is. 13536 13:04:56,641 --> 13:04:59,112 Maybe it's 40, 100, but that's constant. 
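The node layout just described, an array of up to 26 child pointers plus a true/false flag marking where a name stops, can be sketched as follows. This is a sketch under stated assumptions rather than the lecture's C struct: it is written in Python, TrieNode, index_of, insert and lookup are illustrative names, None stands in for a null pointer, creating a TrieNode stands in for malloc, and names are assumed to contain only the letters A through Z.

```python
ALPHABET_SIZE = 26  # one slot per letter A-Z


class TrieNode:
    """One node of the trie: 26 child slots plus an end-of-name flag."""

    def __init__(self):
        self.children = [None] * ALPHABET_SIZE  # None plays the role of NULL
        self.is_word = False  # True if a name stops at this node


def index_of(letter):
    """Map a letter to its slot 0-25, ignoring case."""
    return ord(letter.upper()) - ord("A")


def insert(root, name):
    """Walk one node per letter, allocating nodes that do not exist yet."""
    node = root
    for letter in name:
        i = index_of(letter)
        if node.children[i] is None:
            node.children[i] = TrieNode()  # the malloc step in the C version
        node = node.children[i]
    node.is_word = True  # mark that a name stops here


def lookup(root, name):
    """Take len(name) steps, regardless of how many names are stored."""
    node = root
    for letter in name:
        node = node.children[index_of(letter)]
        if node is None:
            return False
    return node.is_word


root = TrieNode()
for person in ["Hagrid", "Harry", "Hermione"]:
    insert(root, person)
```

The loop in lookup runs once per letter of the name being searched for, so the cost depends on the length of that name and not on how many other names are in the structure. The price is the 26-slot array in every node, most of which stays empty.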
13537 13:04:59,112 --> 13:05:02,322 Which is to say that with a\ntrie, technically speaking 13538 13:05:02,322 --> 13:05:06,961 it is the case that your lookup\n 13539 13:05:09,002 --> 13:05:12,061 It's constant time, because\nunlike every other data structure 13540 13:05:12,061 --> 13:05:16,921 we've looked at, with a trie, the amount\n 13541 13:05:16,921 --> 13:05:20,402 or insert one person is\ncompletely independent of how 13542 13:05:20,402 --> 13:05:24,692 many other pieces of data are\nalready in the data structure. 13543 13:05:24,692 --> 13:05:27,452 And this holds true even if one\nname is a prefix of another. 13544 13:05:27,451 --> 13:05:30,854 I don't think there was a Daniel or\n 13545 13:05:31,771 --> 13:05:35,881 But, D-A-N-I-E-L could be one name. 13546 13:05:35,881 --> 13:05:38,470 And, therefore, we have\na true there in green. 13547 13:05:38,470 --> 13:05:40,262 And if there's a longer\nname like Danielle. 13548 13:05:40,262 --> 13:05:42,242 Then, you keep going\nuntil you get to the E. 13549 13:05:42,241 --> 13:05:45,031 So you can still have with\na trie, one name that's 13550 13:05:45,031 --> 13:05:47,141 a substring of another name. 13551 13:05:47,141 --> 13:05:49,862 So it's not as though we've\ncreated a problem there. 13552 13:05:49,862 --> 13:05:51,533 That, too, is still possible. 13553 13:05:51,533 --> 13:05:54,241 But at the end of the day, it only\n 13554 13:05:54,241 --> 13:05:55,891 to find any of these people. 13555 13:05:55,891 --> 13:05:58,802 And again, that's what's\nparticularly compelling. 13556 13:05:58,802 --> 13:06:00,880 That you effectively have\nconstant time lookup. 13557 13:06:01,921 --> 13:06:05,635 We've gone through this whole story\n 13558 13:06:05,635 --> 13:06:07,052 And then, it went up to n squared. 
13559 13:06:07,832 --> 13:06:12,912 And now constant time, what's the price\n 13560 13:06:19,021 --> 13:06:21,451 And in fact, tries are not\nactually used that often 13561 13:06:21,451 --> 13:06:24,981 amazing as they might sound\non some CS level here. 13562 13:06:28,216 --> 13:06:30,379 AUDIENCE: Much like a [INAUDIBLE]. 13563 13:06:31,171 --> 13:06:33,091 If you're storing all\nof these darn arrays 13564 13:06:33,091 --> 13:06:36,351 it's, again, a sparsely\npopulated data structure. 13565 13:06:37,351 --> 13:06:41,281 Granted there's only 3 names, but most\n 13566 13:06:42,972 --> 13:06:46,022 So this is an incredibly wide\ndata structure, if you will. 13567 13:06:46,021 --> 13:06:48,521 It uses a huge amount of\nmemory to store the names. 13568 13:06:48,521 --> 13:06:50,341 But again, you've got to pick a lane. 13569 13:06:50,341 --> 13:06:53,461 Either you're going to minimize space\n 13570 13:06:53,461 --> 13:06:56,722 It's not really possible to get\ntruly the best of both worlds. 13571 13:06:56,722 --> 13:06:58,772 You have to decide where\nthe inflection point is 13572 13:06:58,771 --> 13:07:01,591 for the device you're writing\n 13573 13:07:02,942 --> 13:07:06,461 And again, taking all of\nthese things into account. 13574 13:07:06,461 --> 13:07:08,881 So lastly, let's do one\nfurther abstraction. 13575 13:07:08,881 --> 13:07:12,391 So even higher level to discuss\nsomething that are generally 13576 13:07:12,391 --> 13:07:14,444 known as abstract data structures. 13577 13:07:14,444 --> 13:07:16,152 It turns out we could\nspend like all day 13578 13:07:16,152 --> 13:07:17,732 all week, talking about\ndifferent things we 13579 13:07:17,732 --> 13:07:19,182 could build with these data structures. 13580 13:07:19,182 --> 13:07:21,139 But for the most part,\nnow that we have arrays. 
13581 13:07:21,139 --> 13:07:23,911 Now that we have linked lists\nor their cousins, trees, which 13582 13:07:24,910 --> 13:07:26,702 And beyond that, there's\neven graphs, where 13583 13:07:26,701 --> 13:07:29,888 the arrows can go in multiple\n 13584 13:07:29,889 --> 13:07:32,222 Now that we have this ability\nto stitch things together 13585 13:07:32,222 --> 13:07:34,272 we can solve all different\ntypes of problems. 13586 13:07:34,271 --> 13:07:38,221 So, for instance, a very\ncommon type of data structure 13587 13:07:38,222 --> 13:07:42,211 to use in a program, or even our\n 13588 13:07:42,211 --> 13:07:46,262 A queue being a data structure\nlike a line outside of a store. 13589 13:07:46,262 --> 13:07:48,332 Where it has what's\ncalled a FIFO property. 13590 13:07:49,722 --> 13:07:52,141 Which is great for fairness,\nat least in the human world. 13591 13:07:52,141 --> 13:07:56,281 And if you've ever waited outside\n 13592 13:07:56,281 --> 13:07:58,472 or some other restaurant\nnearby, presumably 13593 13:07:58,472 --> 13:08:01,262 if you're queuing up at\nthe counter, you want 13594 13:08:01,262 --> 13:08:03,752 the store to maintain a FIFO system. 13595 13:08:05,012 --> 13:08:08,641 So that whoever's first in line gets\n 13596 13:08:08,641 --> 13:08:12,192 So a queue is actually a\ncomputer science term, too. 13597 13:08:12,192 --> 13:08:14,942 And even if you're still in the\n 13598 13:08:14,942 --> 13:08:17,192 there are things you might\nhave heard called printer 13599 13:08:17,192 --> 13:08:19,531 queues, which also do things in order. 13600 13:08:19,531 --> 13:08:21,949 The first person to send\ntheir essay to the printer 13601 13:08:21,949 --> 13:08:24,031 should, ideally, be printed\nbefore the last person 13602 13:08:24,031 --> 13:08:26,402 to send their essay to the printer. 13603 13:08:26,402 --> 13:08:28,202 Again, in the interest of fairness. 13604 13:08:28,201 --> 13:08:29,851 But how can you implement a queue? 
13605 13:08:29,851 --> 13:08:32,731 Well, you typically have to\nimplement 2 fundamental operations 13606 13:08:34,292 --> 13:08:37,391 So adding something to it and\nremoving something from it. 13607 13:08:37,391 --> 13:08:41,131 And the interesting thing here is\n 13608 13:08:41,131 --> 13:08:44,131 Well in the human world, you would\n 13609 13:08:44,131 --> 13:08:46,771 for humans to line up from left\nto right, or right to left. 13610 13:08:47,815 --> 13:08:50,732 Like a printer queue, if you send a\n 13611 13:08:50,732 --> 13:08:52,832 a whole bunch of essays\nor documents, well, you 13612 13:08:52,832 --> 13:08:54,912 need a chunk of memory like an array. 13613 13:08:55,411 --> 13:08:57,631 Well, if you use an\narray, what's a problem 13614 13:08:57,631 --> 13:09:01,241 that could happen in the world\nof printing, for instance? 13615 13:09:01,241 --> 13:09:04,502 If you use an array to store all of\n 13616 13:09:04,502 --> 13:09:05,660 AUDIENCE: It can be filled. 13617 13:09:05,660 --> 13:09:07,202 SPEAKER 1: It could be filled, right. 13618 13:09:07,201 --> 13:09:10,502 So if the programmer decided, HP or\n 13619 13:09:10,502 --> 13:09:14,162 oh, you can send like a megabyte worth\n 13620 13:09:14,161 --> 13:09:16,211 At some point you might\nget an error message 13621 13:09:16,211 --> 13:09:17,582 which says, sorry out of memory. 13622 13:09:18,476 --> 13:09:20,851 Which is maybe a reasonable\nsolution, but a little annoying. 13623 13:09:20,851 --> 13:09:24,481 Or HP could write code that maybe\ndynamically resizes the array 13624 13:09:25,152 --> 13:09:27,722 But at that point, maybe they\nshould just use a linked list. 13625 13:09:28,652 --> 13:09:32,372 So there, too, you could\nimplement the notion of a queue 13626 13:09:32,372 --> 13:09:33,720 using a linked list instead. 13627 13:09:33,720 --> 13:09:35,762 You're going to spend more\nmemory, but you're not 13628 13:09:35,762 --> 13:09:38,132 going to run out of space in your array. 
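A queue built on a linked list, as suggested above, might look like the sketch below. It assumes the two operations are called enqueue and dequeue (the usual names for adding and removing), it is written in Python rather than C, and QueueNode, Queue and the file names are all made up for illustration.

```python
class QueueNode:
    """One link in the list: a value plus a pointer to the next node."""

    def __init__(self, value):
        self.value = value
        self.next = None


class Queue:
    """FIFO queue backed by a singly linked list with head and tail pointers."""

    def __init__(self):
        self.head = None  # next item to leave the queue
        self.tail = None  # most recently added item

    def enqueue(self, value):
        """Add to the back of the line."""
        node = QueueNode(value)
        if self.tail is None:
            self.head = node  # queue was empty
        else:
            self.tail.next = node
        self.tail = node

    def dequeue(self):
        """Remove from the front of the line."""
        if self.head is None:
            raise IndexError("dequeue from an empty queue")
        node = self.head
        self.head = node.next
        if self.head is None:
            self.tail = None  # queue is empty again
        return node.value


# A pretend printer queue: documents print in the order they were sent.
printer = Queue()
for document in ["essay1.pdf", "essay2.pdf", "essay3.pdf"]:
    printer.enqueue(document)

first = printer.dequeue()
```

Unlike a fixed-size array, this version never reports that it is full. It pays for that with one extra pointer per stored item, which is the memory cost mentioned above.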
13629 13:09:38,131 --> 13:09:39,974 Which might be more compelling. 13630 13:09:39,974 --> 13:09:41,641 This happens even in the physical world. 13631 13:09:41,641 --> 13:09:45,122 You go to the store and you start having\n 13632 13:09:45,122 --> 13:09:49,408 And like, for a really busy store,\n 13633 13:09:49,408 --> 13:09:51,991 But in that case, it tends to\nbe more of an array just because 13634 13:09:51,991 --> 13:09:54,447 of the physical notion\nof humans lining up. 13635 13:09:54,447 --> 13:09:56,072 But there's other data structures, too. 13636 13:09:56,072 --> 13:09:59,197 If you've ever gone to the dining hall\n 13637 13:09:59,197 --> 13:10:04,351 tray, you're typically picking up\n 13638 13:10:04,351 --> 13:10:06,211 not the first tray that was cleaned. 13639 13:10:06,722 --> 13:10:10,652 Because these cafeteria trays\nstack up on top of each other. 13640 13:10:10,652 --> 13:10:13,891 And indeed a stack is another\ntype of abstract data structure. 13641 13:10:13,891 --> 13:10:16,351 In the physical world, it's\nliterally something physical 13642 13:10:18,512 --> 13:10:21,422 Which have what we would\ncall a LIFO property. 13643 13:10:22,942 --> 13:10:24,692 So as these things\ncome out of the washer 13644 13:10:24,692 --> 13:10:27,002 they're putting the most\nrecent ones on the top. 13645 13:10:27,002 --> 13:10:30,722 And then you, the human, are probably\n 13646 13:10:30,722 --> 13:10:33,182 Which means in the\nextreme, no one on campus 13647 13:10:33,182 --> 13:10:36,616 might ever use that very first tray. 13648 13:10:36,616 --> 13:10:38,491 Which is probably fine\nin the world of trays 13649 13:10:38,491 --> 13:10:42,451 but would really be bad in the world of\n 13650 13:10:42,451 --> 13:10:44,252 were the property being implemented. 13651 13:10:44,252 --> 13:10:46,322 But here, too, it could be an array. 13652 13:10:47,432 --> 13:10:49,014 And you see this, honestly, every day. 
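The LIFO behavior of the tray stack can be sketched with a plain Python list, where append adds to the top and pop removes from the top. The tray labels are invented for illustration, and the lecture itself leaves open whether a stack is built on an array or a linked list.

```python
# A stack of cafeteria trays: last in, first out.
trays = []
for tray in ["tray 1", "tray 2", "tray 3"]:
    trays.append(tray)  # add to the top of the stack

taken = trays.pop()  # remove from the top: the most recently added tray

# "tray 1" is still at the bottom and stays there as long as
# newer trays keep arriving faster than they are taken.
bottom = trays[0]
```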
13653 13:10:49,014 --> 13:10:51,241 If you're using Gmail\nand your Gmail inbox. 13654 13:10:51,241 --> 13:10:53,761 That is actually a stack,\nat least by default 13655 13:10:53,762 --> 13:10:57,160 where your newest messages,\nlast in, are the first ones 13656 13:10:58,201 --> 13:11:00,061 That's a LIFO data structure. 13657 13:11:00,061 --> 13:11:02,191 And it means that you see\nyour most recent emails. 13658 13:11:02,192 --> 13:11:04,650 But if you have a busy day,\nyou're getting a lot of emails 13659 13:11:04,650 --> 13:11:05,912 it might not be a good thing. 13660 13:11:05,911 --> 13:11:08,311 Because now you're ignoring\nthe people who wrote you 13661 13:11:08,311 --> 13:11:10,621 way earlier in the day or the week. 13662 13:11:10,622 --> 13:11:13,082 So LIFO and FIFO are\njust properties that you 13663 13:11:13,082 --> 13:11:15,842 can achieve with these very\nspecific types of data structures. 13664 13:11:15,841 --> 13:11:17,591 And the parlance\nin the world of stacks 13665 13:11:17,591 --> 13:11:21,451 is to push something onto a\nstack or pop something out. 13666 13:11:21,451 --> 13:11:23,641 These are here, for instance,\nas an example of why 13667 13:11:23,641 --> 13:11:24,932 might you always wear the same color. 13668 13:11:24,932 --> 13:11:27,192 Well, if you're storing all\nof your clothes in a stack 13669 13:11:27,192 --> 13:11:29,012 you might not ever get\nto the different colored 13670 13:11:29,012 --> 13:11:30,452 clothes at the bottom of the list. 13671 13:11:30,451 --> 13:11:35,371 And in fact, to paint this picture,\n 13672 13:11:35,372 --> 13:11:38,372 Just to paint this here, made\nby a faculty member elsewhere. 13673 13:11:38,372 --> 13:11:41,312 Let's go ahead and dim the lights\nfor just a minute or 2 here. 13674 13:11:41,311 --> 13:11:45,466 So that we can take a look\nat Jack learning some facts. 13675 13:11:46,091 --> 13:11:48,841 SPEAKER 2: Once upon a time,\nthere was a guy named Jack. 
13676 13:11:48,841 --> 13:11:52,231 When it came to making friends\nJack did not have the knack. 13677 13:11:52,232 --> 13:11:55,202 So Jack went to talk to the\nmost popular guy he knew. 13678 13:11:55,201 --> 13:11:57,871 He went up to Lou and\nasked, what do I do? 13679 13:11:57,872 --> 13:12:00,332 Lou saw that his friend\nwas really distressed. 13680 13:12:00,332 --> 13:12:03,042 Well, Lou began, just\nlook how you're dressed. 13681 13:12:03,042 --> 13:12:05,612 Don't you have any clothes\nwith a different look? 13682 13:12:08,012 --> 13:12:10,202 Come to my house and\nI'll show them to you. 13683 13:12:10,201 --> 13:12:11,491 So they went off to Jack's. 13684 13:12:11,491 --> 13:12:15,182 And Jack showed Lou the box, where he\n 13685 13:12:16,232 --> 13:12:19,202 Lou said, I see you have\nall your clothes in a pile. 13686 13:12:19,201 --> 13:12:21,781 Why don't you wear some\nothers once in a while? 13687 13:12:21,781 --> 13:12:24,932 Jack said, well, when I\nremove clothes and socks 13688 13:12:24,932 --> 13:12:27,662 I wash them and put\nthem away in the box. 13689 13:12:27,661 --> 13:12:30,151 Then comes the next\nmorning and up I hop. 13690 13:12:30,152 --> 13:12:33,391 I go to the box and get\nmy clothes off the top. 13691 13:12:33,391 --> 13:12:36,002 Lou quickly realized\nthe problem with Jack. 13692 13:12:36,002 --> 13:12:38,972 He kept clothes, CDs,\nand books in a stack. 13693 13:12:38,972 --> 13:12:41,402 When he'd reached for\nsomething to read or to wear 13694 13:12:41,402 --> 13:12:44,012 he chose the top book or underwear. 13695 13:12:44,012 --> 13:12:46,402 Then when he was done he\nwould put it right back. 13696 13:12:46,402 --> 13:12:48,982 Back it would go on top of the stack. 13697 13:12:48,982 --> 13:12:51,352 I know the solution,\nsaid a triumphant Lou. 13698 13:12:51,351 --> 13:12:53,991 You need to learn to\nstart using a queue. 13699 13:12:53,991 --> 13:12:56,781 Lou took Jack's clothes\nand hung them in a closet. 
13700 13:12:56,781 --> 13:12:59,601 And when he had emptied\nthe box, he just tossed it. 13701 13:12:59,601 --> 13:13:03,472 Then he said, now Jack, at the end of\n 13702 13:13:04,951 --> 13:13:07,671 Then tomorrow morning when\nyou see the sunshine, get 13703 13:13:07,671 --> 13:13:10,402 your clothes from the right,\nfrom the end of the line. 13704 13:13:10,402 --> 13:13:13,281 Don't you see, said\nLou, it will be so nice. 13705 13:13:13,281 --> 13:13:16,612 You'll wear everything once\nbefore you wear something twice. 13706 13:13:16,612 --> 13:13:19,552 And with everything in queues\nin his closet and shelf 13707 13:13:19,552 --> 13:13:22,162 Jack started to feel\nquite sure of himself. 13708 13:13:22,161 --> 13:13:24,636 All thanks to Lou and\nhis wonderful queue. 13709 13:13:26,701 --> 13:13:29,701 SPEAKER 1: So just to help you realize\n 13710 13:13:33,862 --> 13:13:35,542 If you've ever lined up at this place. 13711 13:13:37,461 --> 13:13:40,281 OK, so sweetgreen, little\nsalad place in the square. 13712 13:13:40,281 --> 13:13:42,171 This is if you order\nonline or in advance 13713 13:13:42,171 --> 13:13:44,713 your food ends up according to\nthe first letter in your name. 13714 13:13:44,713 --> 13:13:46,963 Which actually sounds awfully\nreminiscent of something 13715 13:13:47,781 --> 13:13:50,841 And in fact, no matter whether\n 13716 13:13:50,841 --> 13:13:52,611 did, with an array and linked list. 13717 13:13:52,612 --> 13:13:54,817 Or with 3 shelves like this. 13718 13:13:54,817 --> 13:13:57,802 This is actually an abstract\ndata type called a dictionary. 13719 13:13:57,802 --> 13:14:01,162 And a dictionary, just like in our\n 13720 13:14:01,161 --> 13:14:02,871 Words and their definitions. 13721 13:14:02,872 --> 13:14:07,372 This just has letters of the\nalphabet and salads as their value. 13722 13:14:07,372 --> 13:14:09,742 But here, too, there's\na real world constraint. 
13723 13:14:09,741 --> 13:14:13,222 In what kind of scenario does\nthis system at sweetgreen 13724 13:14:13,222 --> 13:14:15,891 devolve into a problem, for instance? 13725 13:14:15,891 --> 13:14:19,582 Because they, too, are using only\nfinite space, finite storage. 13726 13:14:22,012 --> 13:14:23,391 If they run out of space\non the shelf and there's 13727 13:14:23,391 --> 13:14:25,862 a lot of people whose names\nstart with D, or E, or whatever. 13728 13:14:26,781 --> 13:14:29,362 And then, maybe, they kind of\noverflow into the E's or the F's. 13729 13:14:29,362 --> 13:14:31,281 And they probably don't\nreally care because any human 13730 13:14:31,281 --> 13:14:33,771 is going to come by, and just\n 13731 13:14:33,771 --> 13:14:36,261 But in the world of a\ncomputer, you're the one coding 13732 13:14:36,262 --> 13:14:38,152 and have to be ever so precise. 13733 13:14:38,152 --> 13:14:41,722 We thought we would lastly\ndo one final thing here. 13734 13:14:41,722 --> 13:14:45,527 In advance, we prepared a linked\nlist of sorts in the audience. 13735 13:14:45,527 --> 13:14:47,152 Since this has become a bit of a thing. 13736 13:14:47,152 --> 13:14:50,012 I am starting to represent the\nbeginning of this linked list. 13737 13:14:50,012 --> 13:14:54,592 And so far as I have a pointer\nhere with seat location G9. 13738 13:14:54,591 --> 13:14:57,981 Whoever is in G9, would\nyou mind standing up? 13739 13:14:57,982 --> 13:15:00,652 And what letter is on your sheet there? 13740 13:15:01,582 --> 13:15:04,132 SPEAKER 1: OK, so you\nhave S15 and your letter-- 13741 13:15:07,161 --> 13:15:09,471 So I see you're holding\na C in your node. 13742 13:15:09,472 --> 13:15:12,982 You are pointing to, if\nyou could physically, F15. 13743 13:15:15,262 --> 13:15:17,872 SPEAKER 1: You have an S. And\nwho should you be pointing at? 13744 13:15:26,302 --> 13:15:30,502 F12, if you'd like to stand up holding\n 13745 13:16:54,512 --> 13:16:58,322 DAVID J. 
MALAN: All right, this is\n 13746 13:16:58,322 --> 13:17:00,980 And this is the week in which\nyou learn yet another language. 13747 13:17:00,980 --> 13:17:03,272 But the goal is not just to\nteach you another language 13748 13:17:03,271 --> 13:17:06,002 for languages sake,\nas we transition today 13749 13:17:06,002 --> 13:17:09,302 and in the coming weeks from C, where\n 13750 13:17:09,961 --> 13:17:14,051 The goal ultimately is to teach you all\n 13751 13:17:14,052 --> 13:17:16,542 so that by the end of this\ncourse, it's not in your mind 13752 13:17:16,542 --> 13:17:19,232 the fact that you learned\nhow to program in C 13753 13:17:19,232 --> 13:17:21,482 or learned some weeks back\nhow to program in Scratch 13754 13:17:21,482 --> 13:17:24,692 but really how you learned\nhow to program fundamentally 13755 13:17:24,692 --> 13:17:27,152 in a paradigm known as\nprocedural programming 13756 13:17:27,152 --> 13:17:29,972 as well as with some taste\ntoday, and in the weeks to come 13757 13:17:29,972 --> 13:17:31,832 of other aspects of\nprogramming languages 13758 13:17:31,832 --> 13:17:34,531 like object-oriented\nprogramming, and more. 13759 13:17:34,531 --> 13:17:36,701 So recall, though, back\nin week zero, Hello, world 13760 13:17:36,701 --> 13:17:38,201 looked a little something like this. 13761 13:17:38,201 --> 13:17:39,908 And the world was quite simple. 13762 13:17:39,908 --> 13:17:42,241 All you had to do was drag\nand drop these puzzle pieces. 13763 13:17:42,241 --> 13:17:45,481 But there were still functions and\n 13764 13:17:45,482 --> 13:17:47,552 and all of those kinds of primitives. 13765 13:17:47,552 --> 13:17:50,822 We then transitioned, of course,\n 13766 13:17:50,822 --> 13:17:52,362 looked a little something like this. 
13767 13:17:52,362 --> 13:17:54,319 And even now, some weeks\nlater, you might still 13768 13:17:54,319 --> 13:17:56,991 be struggling with some of the\nsyntax or getting annoying bugs 13769 13:17:56,991 --> 13:17:59,491 when you try to compile your\ncode, and it just doesn't work. 13770 13:17:59,491 --> 13:18:01,322 But there, too, the\npast few weeks, we've 13771 13:18:01,322 --> 13:18:04,652 been focusing on functions and loops\n 13772 13:18:06,072 --> 13:18:10,232 And so what we begin to do today\n 13773 13:18:10,232 --> 13:18:15,362 we're using, transitioning from C now\n 13774 13:18:15,362 --> 13:18:18,722 program in Python, and look\nat its relative simplicity 13775 13:18:18,722 --> 13:18:20,461 but also transitioning\nto look at how you 13776 13:18:20,461 --> 13:18:22,322 can implement these\nsame kinds of features 13777 13:18:22,322 --> 13:18:23,951 just using a different language. 13778 13:18:23,951 --> 13:18:25,771 So we're going to see\na lot of code today. 13779 13:18:25,771 --> 13:18:29,671 And you won't have nearly as much\n 13780 13:18:29,671 --> 13:18:32,731 But that's because so many of the\n 13781 13:18:32,732 --> 13:18:35,102 And, really, it's going to be a\n 13782 13:18:35,934 --> 13:18:38,281 I know how to do it in C.\nHow do I do this in Python? 13783 13:18:38,281 --> 13:18:39,512 How do I do the same with conditionals? 
13784 13:18:39,512 --> 13:18:41,232 How do I declare\nvariables, and the like 13785 13:18:41,232 --> 13:18:43,982 and moving forward, not just in\nCS50, but in life in general 13786 13:18:43,982 --> 13:18:47,282 if you continue programming and learn\n 13787 13:18:47,281 --> 13:18:50,792 if in 5-10 years, there's a new, more\n 13788 13:18:50,792 --> 13:18:53,042 it's just going to be a\nmatter of googling and looking 13789 13:18:53,042 --> 13:18:54,932 at websites like Stack\nOverflow and the like 13790 13:18:54,932 --> 13:18:57,872 to look at just basic building\nblocks of programming languages 13791 13:18:57,872 --> 13:19:01,202 because you already speak,\nafter these past 6 plus weeks 13792 13:19:01,201 --> 13:19:04,021 you already speak programming\nitself fundamentally. 13793 13:19:04,021 --> 13:19:07,591 All right, so let's do a few quick\n 13794 13:19:07,591 --> 13:19:09,481 something might have\nlooked like in Scratch 13795 13:19:09,482 --> 13:19:11,342 and what it then looked\nlike in C, but now 13796 13:19:11,341 --> 13:19:13,291 as of today, what it's going\nto look like in Python. 13797 13:19:13,292 --> 13:19:15,375 Then we'll turn our attention\nto the command line 13798 13:19:15,375 --> 13:19:19,031 ultimately, in order to\nimplement some actual programs. 13799 13:19:19,031 --> 13:19:22,262 So in Scratch, we had\nfunctions like this, say Hello 13800 13:19:23,792 --> 13:19:26,262 In C it looked a little\nsomething like this 13801 13:19:26,262 --> 13:19:29,672 and a bit of a cryptic mess the\nfirst week, you had the printf 13802 13:19:30,811 --> 13:19:32,502 You had the semicolon, the parentheses. 13803 13:19:32,502 --> 13:19:34,944 So there's a lot more syntax\njust to do the same thing. 
13804 13:19:34,944 --> 13:19:37,862 We're not going to get rid of all\n 13805 13:19:37,862 --> 13:19:42,101 in Python, that same statement is going\n 13806 13:19:42,101 --> 13:19:44,161 And just to perhaps call\nout the obvious, what 13807 13:19:44,161 --> 13:19:48,572 is different or, now, simpler\nin Python versus C, even 13808 13:19:48,572 --> 13:19:50,161 in this simple example here? 13809 13:19:51,067 --> 13:19:53,942 AUDIENCE: Now print, instead of\n 13810 13:19:53,942 --> 13:19:56,359 DAVID J. MALAN: Good, so it's\nnow print instead of printf. 13811 13:19:56,358 --> 13:19:57,631 And there's also no semicolon. 13812 13:19:57,631 --> 13:19:59,625 And there's one other\nsubtlety, over here. 13813 13:20:00,542 --> 13:20:02,162 DAVID J. MALAN: Yeah,\nso no new line, and that 13814 13:20:02,161 --> 13:20:03,631 doesn't mean it's not\ngoing to be printed. 13815 13:20:03,631 --> 13:20:05,923 It just turns out that one\nof the differences we'll see 13816 13:20:05,923 --> 13:20:08,161 is that, with print, you\nget the new line for free. 13817 13:20:08,161 --> 13:20:11,471 It automatically gets outputted by\n 13818 13:20:11,472 --> 13:20:13,711 But you can override it,\nwe'll see, ultimately, too. 13819 13:20:14,822 --> 13:20:18,603 We had multiple functions like\n 13820 13:20:18,603 --> 13:20:20,311 on the screen, but\nalso asked a question 13821 13:20:20,311 --> 13:20:23,822 thereby being another function that\n 13822 13:20:23,822 --> 13:20:26,252 In C we saw code that\nlooked a little something 13823 13:20:26,252 --> 13:20:29,942 like this, whereby that first line\n 13824 13:20:29,942 --> 13:20:32,312 sets it equal to the\nreturn value of getString 13825 13:20:32,311 --> 13:20:34,261 one of the functions\nfrom the CS50 library 13826 13:20:34,262 --> 13:20:37,502 and then the same double quotes\nand parentheses and semicolon. 
13827 13:20:37,502 --> 13:20:41,912 Then we had this format code\nin C that allowed us, with %s 13828 13:20:41,911 --> 13:20:44,281 to actually print out that same value. 13829 13:20:44,281 --> 13:20:46,921 In Python, this, too, is going\nto look a little bit simpler. 13830 13:20:46,921 --> 13:20:49,981 Instead, we're going to have\nanswer equals getString 13831 13:20:49,982 --> 13:20:52,592 quote unquote "What's your\nname," and then print 13832 13:20:52,591 --> 13:20:55,391 with a plus sign and a\nlittle bit of new syntax. 13833 13:20:55,391 --> 13:20:58,171 But let's see if we can't just\ninfer from this example what 13834 13:20:59,381 --> 13:21:02,191 Well, first missing on the left is what? 13835 13:21:02,192 --> 13:21:05,141 To the left of the equal sign,\nthere's no what this time? 13836 13:21:05,141 --> 13:21:06,391 Feel free to just call it out. 13837 13:21:07,211 --> 13:21:07,981 DAVID J. MALAN: So there's no type. 13838 13:21:07,982 --> 13:21:10,292 There's no type, like\nthe word string, which 13839 13:21:10,292 --> 13:21:14,612 even though that was a type in\nCS50, every other variable in C 13840 13:21:14,612 --> 13:21:17,959 did we use int or string or\nfloat, or bool or something else. 13841 13:21:17,959 --> 13:21:20,042 In Python, there are still\ngoing to be data types 13842 13:21:20,042 --> 13:21:22,502 today onward, but you,\nthe programmer, don't 13843 13:21:22,502 --> 13:21:25,563 have to bother telling the\ncomputer what types you're using. 
13844 13:21:25,563 --> 13:21:27,271 The computer is going\nto be smart enough 13845 13:21:27,271 --> 13:21:29,761 the language, really, is going to be\n 13846 13:21:30,781 --> 13:21:32,671 Meanwhile, on the right\nhand side, getString 13847 13:21:32,671 --> 13:21:34,379 is going to be a\nfunction we'll use today 13848 13:21:34,379 --> 13:21:37,841 and this week, which comes from a\n 13849 13:21:37,841 --> 13:21:40,891 But we'll also start to take off\n 13850 13:21:40,891 --> 13:21:44,192 see how to do things without\nany CS50 library moving forward 13851 13:21:44,192 --> 13:21:45,812 using a different function instead. 13852 13:21:45,811 --> 13:21:49,441 As before, no semicolon, but the rest\n 13853 13:21:49,951 --> 13:21:52,534 This starts, of course, to get\na little bit different, though. 13854 13:21:52,535 --> 13:21:54,172 We're using print instead of printf. 13855 13:21:54,171 --> 13:21:57,381 But now, even though this\nlooks a little cryptic 13856 13:21:57,381 --> 13:21:59,631 perhaps, if you've never\nprogrammed before CS50 13857 13:21:59,631 --> 13:22:03,651 what might that plus be doing,\njust based on inference here. 13858 13:22:04,402 --> 13:22:08,241 AUDIENCE: Adding answer\nto the string Hello. 13859 13:22:08,241 --> 13:22:11,511 DAVID J. MALAN: Yeah, so adding\nanswer to the string Hello 13860 13:22:11,512 --> 13:22:13,552 and adding, so to speak,\nnot mathematically 13861 13:22:13,552 --> 13:22:16,102 but in the form of joining\nthem together, much like we 13862 13:22:16,101 --> 13:22:19,561 saw the joined block in Scratch, or\n 13863 13:22:20,061 --> 13:22:23,331 This plus sign appends,\nif you will, whatever's 13864 13:22:23,332 --> 13:22:25,147 in answer to whatever is quoted here. 13865 13:22:25,146 --> 13:22:27,771 And I deliberately left a space\nthere, so that grammatically it 13866 13:22:27,771 --> 13:22:29,943 looks nice, after the comma as well. 13867 13:22:29,944 --> 13:22:31,402 Now there's another way to do this. 
13868 13:22:31,402 --> 13:22:33,652 And it, too, is going to\nlook cryptic at first glance. 13869 13:22:33,652 --> 13:22:36,031 But it just gets easier and\nmore convenient over time. 13870 13:22:36,031 --> 13:22:41,101 You can also change this second\nline to be this, instead. 13871 13:22:42,292 --> 13:22:45,232 This is actually a relatively new\n 13872 13:22:45,232 --> 13:22:47,542 of years, where now what\nyou're seeing is, yes 13873 13:22:47,542 --> 13:22:50,101 a string, between these\nsame double quotes 13874 13:22:50,101 --> 13:22:53,597 but this is what Python would\ncall a format string, or f-string. 13875 13:22:53,597 --> 13:22:56,722 And it literally starts with the letter\n 13876 13:22:57,502 --> 13:23:01,222 But that just indicates\nthat Python should 13877 13:23:01,222 --> 13:23:05,632 assume that anything inside of\ncurly braces inside of the string 13878 13:23:05,631 --> 13:23:09,081 should be interpolated, so to\n 13879 13:23:09,082 --> 13:23:12,682 substitute the value of\nany variables therein. 13880 13:23:12,682 --> 13:23:14,552 And it can do some other things as well. 13881 13:23:14,552 --> 13:23:18,562 So answer is a variable, declared,\n 13882 13:23:18,561 --> 13:23:22,822 This f-string, then, says to Python,\n 13883 13:23:24,472 --> 13:23:28,912 If, by contrast, you\nomitted the curly braces 13884 13:23:28,911 --> 13:23:30,561 just take a guess, what would happen? 13885 13:23:30,561 --> 13:23:33,441 What would the symptom of that\nbug be, if you accidentally 13886 13:23:33,442 --> 13:23:36,531 forgot the curly braces, but\nmaybe still had the f there? 13887 13:23:36,531 --> 13:23:38,271 AUDIENCE: It would print below it, too. 13888 13:23:38,271 --> 13:23:40,822 DAVID J. MALAN: Yeah, it would literally\n 13889 13:23:41,722 --> 13:23:44,211 So the curly braces just kind\nof allow you to plug things in. 
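The greeting variants described above can be sketched roughly as follows. This is an illustration rather than the slide shown in lecture: it assumes Python 3.6 or later (for f-strings), and it hard-codes a name in place of the CS50 library's input function so that it runs without anything extra installed.

```python
# Assumption: a fixed name stands in for whatever the user would type.
answer = "David"

# No type declaration and no semicolon; + joins the two strings.
joined = "hello, " + answer

# An f-string substitutes the value of any variable inside curly braces.
formatted = f"hello, {answer}"

# Without the curly braces, the word "answer" is printed literally.
literal = f"hello, answer"

print(joined)                 # print adds the newline automatically
print(formatted)
print(literal)
print("no newline yet", end="")   # end= overrides the default newline
print()
```

The first two print calls show the same greeting; the third shows the symptom of forgetting the curly braces.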
13890 13:23:44,211 --> 13:23:45,872 And, again, it looks\na little more cryptic 13891 13:23:45,872 --> 13:23:47,789 but it's just going to\nsave us time over time. 13892 13:23:47,788 --> 13:23:50,641 And if any of you programmed in\n 13893 13:23:50,641 --> 13:23:53,152 you saw plus in that context,\ntoo, for concatenation. 13894 13:23:53,152 --> 13:23:56,277 This just kind of makes your code a\n 13895 13:23:56,277 --> 13:23:58,252 So it's a convenient\nfeature now in Python. 13896 13:23:58,252 --> 13:24:00,711 All right, this was an example\nin Scratch of a variable 13897 13:24:00,711 --> 13:24:03,262 setting a variable like\ncounter equal to 0. 13898 13:24:03,262 --> 13:24:06,982 In C it looked like this, where\nyou specify the type, the name 13899 13:24:06,982 --> 13:24:08,752 and then the value, with a semicolon. 13900 13:24:08,752 --> 13:24:11,618 In Python, it's going to look like this. 13901 13:24:11,618 --> 13:24:12,951 And I'll state the obvious here. 13902 13:24:12,951 --> 13:24:15,862 You don't need to mention the\n 13903 13:24:15,862 --> 13:24:17,552 And you don't need a semicolon. 13904 13:24:18,652 --> 13:24:21,527 If you want a variable, just write\n 13905 13:24:21,527 --> 13:24:24,592 But the single equal sign\nstill behaves the same as in C. 13906 13:24:24,591 --> 13:24:26,961 Suppose we wanted to\nincrement counter by one. 13907 13:24:26,961 --> 13:24:29,271 In Scratch, we use\nthis puzzle piece here. 13908 13:24:29,271 --> 13:24:31,771 In C, we could do this, actually,\nin a few different ways. 13909 13:24:31,771 --> 13:24:33,921 There was this way, if\ncounter already exists 13910 13:24:33,921 --> 13:24:36,502 you just say counter\nequals counter plus 1. 13911 13:24:36,502 --> 13:24:41,362 There was the slightly less verbose\n 13912 13:24:41,362 --> 13:24:42,921 Let me do the first sentence first. 
13913 13:24:42,921 --> 13:24:45,211 In Python, that same\nthing, as you might guess 13914 13:24:45,211 --> 13:24:48,682 is actually going to be almost the\n 13915 13:24:48,682 --> 13:24:51,891 And the mathematics are ultimately\n 13916 13:24:51,891 --> 13:24:53,811 via the assignment operator. 13917 13:24:53,811 --> 13:24:56,091 Now, recall, in C, that\nwe had this shorthand 13918 13:24:56,091 --> 13:24:58,521 notation, which did the same thing. 13919 13:24:58,521 --> 13:25:03,502 In Python, you can similarly do the same\n 13920 13:25:03,502 --> 13:25:05,811 The only step backwards\nwe're taking, if you 13921 13:25:05,811 --> 13:25:10,311 were a big fan of counter plus\n 13922 13:25:12,021 --> 13:25:16,731 You have to do the plus equals 1\nor minus equals 1 13923 13:25:16,732 --> 13:25:20,242 to achieve that same result. All\nright, how about in Python, too? 13924 13:25:20,241 --> 13:25:22,881 Here in Scratch, recall,\nwas a conditional 13925 13:25:22,881 --> 13:25:26,511 asking a silly question like is x less\n 13926 13:25:26,512 --> 13:25:30,502 In C, that looked a little\nsomething like this, printf and if 13927 13:25:30,502 --> 13:25:33,832 with the parentheses, the curly\n 13928 13:25:33,832 --> 13:25:37,132 In Python, this is going to get a\n 13929 13:25:39,841 --> 13:25:42,981 And if someone wants to call out\n 13930 13:25:42,982 --> 13:25:46,887 what has been simplified now in Python\n 13931 13:25:46,887 --> 13:25:48,262 Yeah, what's missing, or changed? 13932 13:25:48,872 --> 13:25:49,927 DAVID J. MALAN: So no curly braces. 13933 13:25:51,891 --> 13:25:53,031 AUDIENCE: Using the colon instead. 13934 13:25:53,031 --> 13:25:55,114 DAVID J. MALAN: And we're\nusing the colon instead. 13935 13:25:55,114 --> 13:25:57,141 So I got rid of the\ncurly braces in Python. 13936 13:25:57,141 --> 13:25:58,714 But I'm using a colon instead. 
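As a brief aside on the counter discussion, the variable handling might be sketched like this. The name counter comes from the lecture; the sequence of operations is only an illustration.

```python
counter = 0            # no type and no semicolon

counter = counter + 1  # same idea as in C, minus the semicolon
counter += 1           # shorthand, also valid in Python

# counter++ is a syntax error in Python; use += 1 or -= 1 instead.
counter -= 1

print(counter)
```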
13937 13:25:58,714 --> 13:26:00,631 And even though this is\na single line of code 13938 13:26:00,631 --> 13:26:04,971 so long as you indent subsequent\nlines along with the printf 13939 13:26:04,972 --> 13:26:09,351 that's going to imply that everything,\n 13940 13:26:09,351 --> 13:26:13,491 should be executed below it, until you\n 13941 13:26:13,491 --> 13:26:14,991 a different line of code altogether. 13942 13:26:14,991 --> 13:26:17,521 So indentation in Python is important. 13943 13:26:17,521 --> 13:26:21,621 So this is among the reasons\nwe've emphasized axes like style 13944 13:26:21,622 --> 13:26:23,362 just how well styled your code is. 13945 13:26:23,362 --> 13:26:25,881 And honestly, we've seen,\ncertainly, in office hours 13946 13:26:25,881 --> 13:26:28,521 and you've seen in your own code,\nsort of a tendency sometimes 13947 13:26:28,521 --> 13:26:31,551 to be a little lax when it\ncomes to indentation, right? 13948 13:26:31,552 --> 13:26:34,192 If you're one of those folks\nwho likes to indent everything 13949 13:26:34,192 --> 13:26:37,732 on the left hand side of the window,\n 13950 13:26:37,732 --> 13:26:41,392 But it's not particularly\nreadable by you or anyone else. 13951 13:26:41,391 --> 13:26:45,112 Python actually addresses this\nby just requiring indentation 13952 13:26:46,311 --> 13:26:50,572 So Python is going to force you to start\n 13953 13:26:50,572 --> 13:26:53,201 perhaps, a tendency otherwise. 13954 13:26:54,141 --> 13:26:55,572 Well, we have no semicolon here. 13955 13:26:55,572 --> 13:26:57,671 Of course, it's print instead of printf. 13956 13:26:57,671 --> 13:27:00,341 But otherwise, those seem to\nbe the primary differences. 13957 13:27:00,341 --> 13:27:02,201 What about something larger in Scratch? 13958 13:27:02,201 --> 13:27:05,333 If an if-else block, like\nthis, you can perhaps 13959 13:27:05,334 --> 13:27:06,792 guess what it's going to look like. 
13960 13:27:06,792 --> 13:27:10,061 In C it looks like this, curly\nbraces semicolons, and so forth. 13961 13:27:10,061 --> 13:27:14,051 In Python, it's going to now\nlook like this, almost the same 13962 13:27:14,052 --> 13:27:15,342 but indentation is important. 13963 13:27:16,482 --> 13:27:19,332 And there's one other difference\nthat's now again visible here 13964 13:27:19,332 --> 13:27:21,192 but we didn't call it out a second ago. 13965 13:27:21,192 --> 13:27:24,281 What else is different in Python\n 13966 13:27:24,993 --> 13:27:27,641 AUDIENCE: You don't have any\nparentheses around the condition. 13967 13:27:28,222 --> 13:27:30,612 We don't have any parentheses\naround the condition 13968 13:27:30,612 --> 13:27:32,232 the Boolean expression itself. 13969 13:27:33,088 --> 13:27:34,421 Well, it's just simpler to type. 13970 13:27:35,472 --> 13:27:36,972 You can still use parentheses. 13971 13:27:36,972 --> 13:27:39,072 And, in fact, you might\nwant to or need to 13972 13:27:39,072 --> 13:27:43,991 if you want to combine thoughts and\n 13973 13:27:43,991 --> 13:27:47,441 But by default, you no longer need\n 13974 13:27:48,671 --> 13:27:50,961 Lastly, with conditionals,\nwe had something like this 13975 13:27:50,961 --> 13:27:53,292 an if else if else statement. 13976 13:27:53,292 --> 13:27:55,362 In C, it looked a little\nsomething like this. 13977 13:27:55,362 --> 13:27:57,402 In Python, it's going to\nget really tighter now. 13978 13:27:57,402 --> 13:28:02,351 It's just if, and this is the\ncuriosity, elif x greater than y. 13979 13:28:02,351 --> 13:28:07,631 So it's not else if, it's literally\n 13980 13:28:07,631 --> 13:28:09,836 remain now on each of the three lines. 13981 13:28:09,836 --> 13:28:11,211 But the indentation is important. 13982 13:28:11,211 --> 13:28:13,002 And if we did want to\ndo multiple things 13983 13:28:13,002 --> 13:28:16,760 we could just indent below each\nof these conditionals, as well. 
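The three-way conditional described above could look something like the sketch below. Wrapping it in a function named compare is an assumption made here so the result is easy to check; the lecture's version prints directly.

```python
def compare(x, y):
    """Return a message describing how x relates to y."""
    # No parentheses around the condition, a colon instead of curly
    # braces, and indentation marks what belongs to each branch.
    if x < y:
        return "x is less than y"
    elif x > y:          # elif, not "else if"
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```

Removing the indentation under any branch makes Python reject the program, which is the enforcement of style the lecture refers to.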
13984 13:28:16,760 --> 13:28:18,552 All right, let me pause\nthere first, to see 13985 13:28:18,552 --> 13:28:21,012 if there's any questions on\nthese syntactic differences. 13986 13:28:21,769 --> 13:28:24,054 AUDIENCE: My thought is\nmaybe like, it's good 13987 13:28:24,053 --> 13:28:27,681 though, does it matter if there's\n 13988 13:28:28,692 --> 13:28:31,572 DAVID J. MALAN: In between,\nbetween what and what? 13989 13:28:31,572 --> 13:28:34,942 AUDIENCE: So like the left-hand\n 13990 13:28:34,942 --> 13:28:38,352 DAVID J. MALAN: Ah, good\nquestion, is Python sensitive 13991 13:28:38,351 --> 13:28:40,271 to spaces and where they go? 13992 13:28:40,271 --> 13:28:42,911 Sometimes no, sometimes\nyes, is the short answer. 13993 13:28:42,911 --> 13:28:46,601 Stylistically, though, you should be\n 13994 13:28:46,601 --> 13:28:50,786 whereby you do have spaces to the\n 13995 13:28:50,786 --> 13:28:52,661 that they're called,\nsomething like less than 13996 13:28:52,661 --> 13:28:54,869 or greater than is a binary\noperator, because there's 13997 13:28:54,870 --> 13:28:57,102 two operands to the left\nand to the right of them. 13998 13:28:57,101 --> 13:29:00,161 And in fact, in Python,\nmore so than the world of C 13999 13:29:00,161 --> 13:29:02,861 there's actually formal\nstyle conventions. 14000 13:29:02,862 --> 13:29:07,209 Not only within CS50 have we had a\n 14001 13:29:07,209 --> 13:29:10,542 for instance, that just dictates how you\n 14002 13:29:11,466 --> 13:29:13,841 In the Python community, they\ntake this one step further 14003 13:29:13,841 --> 13:29:17,781 and there's an actual standard whereby\n 14004 13:29:17,781 --> 13:29:20,832 but generally speaking, in the real\n 14005 13:29:20,832 --> 13:29:23,622 would reject your code, if you're trying\n 14006 13:29:23,622 --> 13:29:25,252 if you don't adhere to these standards. 
14007 13:29:25,252 --> 13:29:28,211 So while you could be lax\nwith some of this white space 14008 13:29:29,381 --> 13:29:33,296 And that's Python theme, for the\n 14009 13:29:33,296 --> 13:29:35,921 All right, so let's take a look\nat a couple of other constructs 14010 13:29:35,921 --> 13:29:37,881 before transitioning\nto some actual code. 14011 13:29:37,881 --> 13:29:40,631 This, of course, in Scratch\nwas a loop, meowing forever. 14012 13:29:40,631 --> 13:29:44,862 In C, the closest we could get was\n 14013 13:29:45,622 --> 13:29:48,582 So it's sort of a simple way\nof just saying do this forever. 14014 13:29:48,582 --> 13:29:51,461 In Python, it's pretty\nmuch the same thing 14015 13:29:51,461 --> 13:29:53,262 but a couple of small differences here. 14016 13:29:57,161 --> 13:30:00,784 No semicolon, and there's\none other subtle difference. 14017 13:30:01,451 --> 13:30:02,441 AUDIENCE: True is capitalized? 14018 13:30:02,442 --> 13:30:04,525 DAVID J. MALAN: True is\ncapitalized, just because. 14019 13:30:04,525 --> 13:30:07,092 Both true and false are\nBoolean values in Python. 14020 13:30:07,091 --> 13:30:09,671 But you've got to start\ncapitalizing them, just because. 14021 13:30:09,671 --> 13:30:11,561 All right, how about a\nloop like this, where 14022 13:30:11,561 --> 13:30:14,981 you repeat something a finite number\n 14023 13:30:14,982 --> 13:30:17,572 In C, we could do this\na few different ways. 14024 13:30:17,572 --> 13:30:21,311 There's this very mechanical way,\n 14025 13:30:22,091 --> 13:30:25,871 You then use a while loop and\ncheck if i is less than 3 14026 13:30:25,872 --> 13:30:27,709 the total number of\ntimes you want to meow. 14027 13:30:27,709 --> 13:30:29,292 Then you print what you want to print. 14028 13:30:29,292 --> 13:30:32,891 You increment i using this syntax,\n 14029 13:30:32,891 --> 13:30:34,402 with plus equals or whatnot. 14030 13:30:34,402 --> 13:30:36,732 And then you do it again\nand again and again. 
14031 13:30:36,732 --> 13:30:40,692 In Python, you can do it\nfunctionally the same way, same idea 14032 13:30:42,101 --> 13:30:44,711 You just don't bother saying\nwhat type of variable you want. 14033 13:30:44,711 --> 13:30:47,559 Python will infer from the fact\nthat there's a 0 right there. 14034 13:30:47,559 --> 13:30:48,851 You don't need the parentheses. 14035 13:30:51,281 --> 13:30:54,432 You can't do the i plus plus, but\n 14036 13:30:54,432 --> 13:30:56,622 as we could have done in C, as well. 14037 13:30:56,622 --> 13:30:58,842 How else might we do this, though, too? 14038 13:30:58,841 --> 13:31:01,061 Well. it turns out in\nC, we could do something 14039 13:31:01,061 --> 13:31:04,752 like this, which, again, sort\nof cryptic at first glance 14040 13:31:04,752 --> 13:31:07,692 became perhaps more familiar,\nwhere you have initialization 14041 13:31:07,692 --> 13:31:11,442 a conditional, and then an update\n 14042 13:31:11,442 --> 13:31:14,472 In Python, there isn't really an analog. 14043 13:31:14,472 --> 13:31:17,022 There is no analog in\nPython, where you have 14044 13:31:17,021 --> 13:31:19,901 the parentheses and the multiple\nsemicolons in the same line. 14045 13:31:19,902 --> 13:31:23,531 Instead, there is a for loop, but\n 14046 13:31:23,531 --> 13:31:27,072 like English, for i in 0, 1, and 2. 14047 13:31:27,072 --> 13:31:31,302 So we'll see in a bit, these square\n 14048 13:31:31,302 --> 13:31:33,612 to be called a list in Python. 14049 13:31:33,612 --> 13:31:37,811 So lists in Python are more like\n 14050 13:31:38,902 --> 13:31:42,732 So this just means for i and the\nfollowing list of three values. 
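The loop forms covered so far can be sketched as follows. The break in the first loop and the helper lists meows and visited are additions made here so the example terminates and can be inspected; they are not part of the lecture's code.

```python
# Loop "forever": note the capital T in True. The break exists only
# so that this sketch stops after three iterations.
meows = []
while True:
    meows.append("meow")
    if len(meows) == 3:
        break

# The mechanical counting loop, translated from C.
i = 0
while i < 3:
    print("meow")
    i += 1

# A for loop over an explicit list of three values.
visited = []
for j in [0, 1, 2]:
    visited.append(j)
    print("meow")
```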
14051 13:31:42,732 --> 13:31:46,342 And on each iteration of this loop,\n 14052 13:31:49,362 --> 13:31:54,402 Then it sets i to two, so that you\n 14053 13:31:54,402 --> 13:31:57,972 But this doesn't necessarily scale,\n 14054 13:31:57,972 --> 13:32:01,662 Suppose you took this\nat face value as the way 14055 13:32:01,661 --> 13:32:05,501 you iterate some number of times\nin Python, using a for loop. 14056 13:32:05,502 --> 13:32:10,004 At what point does this approach\nperhaps get bad, or bad design? 14057 13:32:10,004 --> 13:32:11,711 Let me give folks just\na moment to think. 14058 13:32:12,936 --> 13:32:15,603 AUDIENCE: If you don't know how\nmany times, last time, you know 14059 13:32:15,603 --> 13:32:17,604 you've got the link in there. 14060 13:32:17,605 --> 13:32:20,022 DAVID J. MALAN: Sure, if you\ndon't know how many times you 14061 13:32:20,021 --> 13:32:23,981 want to loop or iterate, you can't\n 14062 13:32:26,845 --> 13:32:29,512 AUDIENCE: So you want to say raise\na large number of allowances. 14063 13:32:29,512 --> 13:32:32,262 DAVID J. MALAN: Yeah, if you're\n 14064 13:32:32,262 --> 13:32:34,162 this list is going to\nget longer and longer 14065 13:32:34,161 --> 13:32:36,453 and you're just kind of\nstupidly going to be typing out 14066 13:32:36,453 --> 13:32:40,182 like comma 3, comma 4, comma 5, comma\n 14067 13:32:40,182 --> 13:32:42,682 I mean, your code would start\nto look atrocious, eventually. 14068 13:32:44,031 --> 13:32:46,881 In Python, there is a function,\nor technically a type 14069 13:32:46,881 --> 13:32:51,052 called range, that essentially magically\n 14070 13:32:51,052 --> 13:32:54,121 from 0 on up to, but\nnot through a value. 14071 13:32:54,120 --> 13:32:58,131 So the effect of this line of\n 14072 13:32:58,131 --> 13:33:01,006 essentially hands you back\na list of three values 14073 13:33:01,006 --> 13:33:02,881 thereby letting you do\nsomething three times. 
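A minimal sketch of the range-based loop described above follows. One detail worth hedging: range produces a range object rather than a literal list, so list() is used here only to make the three values visible.

```python
# Do something three times without typing out [0, 1, 2] by hand.
for i in range(3):
    print("meow")

# range(3) counts from 0 up to, but not through, 3.
values = list(range(3))
print(values)
```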
14074 13:33:02,881 --> 13:33:05,588 And if you want to do something\n 14075 13:33:07,597 --> 13:33:11,612 AUDIENCE: Is there a way to start\n 14076 13:33:11,612 --> 13:33:15,932 at a number or an integer that's higher\n 14077 13:33:16,982 --> 13:33:18,062 DAVID J. MALAN: A really\ngood question, can 14078 13:33:18,061 --> 13:33:19,961 you start counting at a higher number. 14079 13:33:19,961 --> 13:33:23,432 So not 0, which is the implied default,\n 14080 13:33:23,432 --> 13:33:28,082 Yes, so it turns out the range function\n 14081 13:33:28,082 --> 13:33:31,520 but maybe two or even three, that\n 14082 13:33:31,519 --> 13:33:33,061 So you can customize where it begins. 14083 13:33:33,061 --> 13:33:34,441 You can customize the increment. 14084 13:33:34,442 --> 13:33:36,234 By default, it's one,\nbut if you want to do 14085 13:33:36,233 --> 13:33:39,103 every two values, for like evens\n 14086 13:33:40,061 --> 13:33:42,451 And before long, we'll take a\nlook at some Python documentation 14087 13:33:42,451 --> 13:33:45,331 that will become your authoritative\n 14088 13:33:45,332 --> 13:33:47,312 Like, what can this function do. 14089 13:33:47,311 --> 13:33:51,542 Other questions on this thus far? 14090 13:33:51,542 --> 13:33:56,502 Seeing none, so what else might\nwe compare and contrast here. 14091 13:33:56,502 --> 13:34:00,841 Well, in the world of C, recall that\n 14092 13:34:00,841 --> 13:34:04,831 types, like these here, Bool and char\n 14093 13:34:04,832 --> 13:34:08,192 string, which happened to\ncome from the CS50 library. 14094 13:34:08,192 --> 13:34:12,512 But the language C itself certainly\n 14095 13:34:12,512 --> 13:34:17,222 because the backslash 0, the support\n 14096 13:34:17,222 --> 13:34:19,891 built into C, not a CS50 simplification. 
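Returning to the question about where range begins: a sketch of the optional start and step arguments is below. The variable names are hypothetical, chosen only for illustration.

```python
# Two arguments: start at 2, stop before 5.
from_two = list(range(2, 5))

# Three arguments: start, stop, and step, here every second value.
evens = list(range(0, 10, 2))

# A negative step counts downward.
countdown = list(range(3, 0, -1))

print(from_two, evens, countdown)
```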
14097 13:34:19,891 --> 13:34:22,141 All we did, and revealed,\nas of a couple of weeks 14098 13:34:22,141 --> 13:34:24,572 ago, is that string,\nthis data type, is just 14099 13:34:24,572 --> 13:34:29,252 a synonym, or typedef, for char star,\n 14100 13:34:29,252 --> 13:34:32,131 In Python now, this list actually\n 14101 13:34:32,131 --> 13:34:33,964 for these common primitive data types. 14102 13:34:33,964 --> 13:34:36,631 Still going to have bools, we're\ngoing to have floats, and ints 14103 13:34:36,631 --> 13:34:39,121 and we're going to have strings,\n 14104 13:34:39,122 --> 13:34:41,282 And this is not a CS50\nthing from the library 14105 13:34:41,281 --> 13:34:44,822 str, S-T-R, is, in fact,\na data type in Python 14106 13:34:44,822 --> 13:34:48,781 that's going to do a lot more than\n 14107 13:34:48,781 --> 13:34:53,654 ints and floats, meanwhile, don't need\n 14108 13:34:53,654 --> 13:34:56,072 because, in fact, among the\nproblems Python solves for us 14109 13:34:56,072 --> 13:34:58,862 too, ints can get as big as you want. 14110 13:34:58,862 --> 13:35:01,741 Integer overflow is no\nlonger going to be an issue. 14111 13:35:01,741 --> 13:35:04,472 Per week 1, the language\nsolves that for us. 14112 13:35:04,472 --> 13:35:06,311 Floating point\nimprecision, unfortunately 14113 13:35:06,311 --> 13:35:07,711 is still a problem that remains. 14114 13:35:07,711 --> 13:35:11,252 But there are libraries, code that\n 14115 13:35:11,252 --> 13:35:13,531 discussed in weeks past,\nthat allow you to do 14116 13:35:13,531 --> 13:35:16,771 scientific or financial computing,\nusing libraries that build 14117 13:35:16,771 --> 13:35:19,146 on top of these data types, as well. 
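The claims about integers and floats can be checked with a short sketch. The use of the standard library's decimal module is a choice made here as one example of such a library; the lecture does not name a specific one at this point.

```python
from decimal import Decimal

# Integers grow as needed, so this does not overflow.
big = 2 ** 64
print(big)

# Floating point imprecision remains: this comparison is False.
print(0.1 + 0.2 == 0.3)

# Decimal works with exact decimal values when built from strings.
exact = Decimal("0.1") + Decimal("0.2")
print(exact == Decimal("0.3"))
```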
14118 13:35:19,146 --> 13:35:22,021 So there's other data types, too,\n 14119 13:35:22,021 --> 13:35:25,231 gives us a whole bunch of\nmore power and capability 14120 13:35:25,232 --> 13:35:28,022 things called ranges,\nlike we just saw, lists 14121 13:35:28,021 --> 13:35:30,601 like I called out verbally,\nwith the square brackets 14122 13:35:30,601 --> 13:35:33,421 things called tuples, for\nthings like x comma y 14123 13:35:33,421 --> 13:35:36,826 or latitude, longitude,\ndictionaries, or Dicts 14124 13:35:36,826 --> 13:35:40,261 which allow you to store keys and\n 14125 13:35:40,262 --> 13:35:43,495 from last time, and then sets in the\n 14126 13:35:43,495 --> 13:35:46,412 out duplicates for you, and you can\n 14127 13:35:46,411 --> 13:35:50,431 a whole bunch of words or whatnot,\n 14128 13:35:50,432 --> 13:35:52,921 will filter out duplicates for you. 14129 13:35:52,921 --> 13:35:56,506 Now there's going to be a few functions\n 14130 13:35:56,506 --> 13:35:59,131 training wheels that we're then\ngoing to very quickly take off 14131 13:35:59,131 --> 13:36:02,581 just because, as we'll see today, they\n 14132 13:36:02,582 --> 13:36:05,726 user input correctly, without\naccidentally writing buggy code 14133 13:36:05,726 --> 13:36:08,851 just when you're trying to get Hello,\n 14134 13:36:08,851 --> 13:36:12,572 And we'll give you functions, not\n 14135 13:36:12,572 --> 13:36:15,152 but a subset of these,\nget float, get Int 14136 13:36:15,152 --> 13:36:18,182 and get string, that'll\nautomate the process of getting 14137 13:36:18,182 --> 13:36:21,932 user input in a way that's more\n 14138 13:36:21,932 --> 13:36:23,792 But we'll see what those bugs might be. 14139 13:36:23,792 --> 13:36:26,641 And the way we're going to do\nthis is similar in spirit to C. 14140 13:36:26,641 --> 13:36:30,902 Instead of doing include,\nCS50.h, like we did in C 14141 13:36:30,902 --> 13:36:33,812 you're going to now\nstart saying import CS50. 
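The additional data types listed above might be used as in the sketch below. All names and values are hypothetical placeholders, including the coordinates and the phone number.

```python
# tuple: a fixed pair of values, such as latitude and longitude.
point = (42.37, -71.11)
latitude, longitude = point

# dict: keys mapped to values, much like a hash table.
phone_book = {"Carter": "555-0100"}
phone_book["David"] = "555-0101"

# set: duplicates are filtered out automatically.
words = ["cat", "dog", "cat", "bird", "dog"]
unique = set(words)

print(latitude, longitude)
print(phone_book["Carter"])
print(len(unique))
```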
14142 13:36:33,811 --> 13:36:37,081 Python supports,\nsimilar to C, libraries 14143 13:36:37,082 --> 13:36:38,822 but there aren't header files anymore. 14144 13:36:38,822 --> 13:36:41,612 You just use the name of\nthe library in Python. 14145 13:36:41,612 --> 13:36:44,972 And if you want to import CS50's\n 14146 13:36:44,972 --> 13:36:48,991 Or, if you want to be more precise, and\n 14147 13:36:48,991 --> 13:36:52,381 could be slow, if you've got a really\n 14148 13:36:52,381 --> 13:36:56,252 in it, you can be more precise and\n 14149 13:36:56,252 --> 13:37:00,002 From CS50 import get int,\nfrom CS50 import get string 14150 13:37:00,002 --> 13:37:02,792 or you can just separate\nthem by commas and import 3 14151 13:37:02,792 --> 13:37:07,072 and only 3 things from a\nparticular library, like ours. 14152 13:37:07,072 --> 13:37:08,822 But starting today and\nonward, we're going 14153 13:37:08,822 --> 13:37:11,972 to start making much more\nheavy use of libraries, code 14154 13:37:11,972 --> 13:37:15,092 that other people wrote, so that\n 14155 13:37:15,091 --> 13:37:18,396 We're not making our own linked lists,\n 14156 13:37:18,396 --> 13:37:20,771 We're going to start standing\non the shoulders of others 14157 13:37:20,771 --> 13:37:23,641 so that you can get real work\ndone, so to speak, faster 14158 13:37:23,641 --> 13:37:28,231 by building your software on\ntop of others' code as well. 14159 13:37:28,232 --> 13:37:31,632 All right, so that's it for the\nsyntactic tour of the language 14160 13:37:31,631 --> 13:37:32,881 and the sort of core features. 14161 13:37:32,881 --> 13:37:34,841 Soon we'll transition\nto application thereof. 14162 13:37:34,841 --> 13:37:40,561 But let me pause here to see if there's\n 14163 13:37:48,726 --> 13:37:52,685 AUDIENCE: Why don't Python\nhave the increment operators. 14164 13:37:52,684 --> 13:37:54,851 DAVID J. MALAN: I'm sorry,\nsay it again, why doesn't 14165 13:37:54,851 --> 13:37:56,309 Python have what kind of operators? 
14166 13:37:56,309 --> 13:37:59,099 AUDIENCE: Why doesn't Python\nhave the increment operator? 14167 13:37:59,099 --> 13:38:02,141 DAVID J. MALAN: Sorry, someone coughed\n 14168 13:38:03,470 --> 13:38:05,262 DAVID J. MALAN: Oh,\nthe increment operator? 14169 13:38:05,262 --> 13:38:06,929 I'd have to check the history, honestly. 14170 13:38:06,928 --> 13:38:09,431 Python has tended to be a\nfairly minimalist language. 14171 13:38:09,432 --> 13:38:12,612 And if you can do something one\nway, the community, arguably 14172 13:38:12,612 --> 13:38:16,667 has tended to not give you multiple\n 14173 13:38:16,667 --> 13:38:18,042 There's probably a better answer. 14174 13:38:18,042 --> 13:38:22,362 And I'll see if I can dig in and post\n 14175 13:38:22,362 --> 13:38:26,391 All right, so before we transition\n 14176 13:38:26,391 --> 13:38:31,391 let me go ahead and consider exactly\n 14177 13:38:31,391 --> 13:38:35,292 In the world of C, recall that it's\n 14178 13:38:35,292 --> 13:38:40,752 We create a file called like hello.c,\n 14179 13:38:41,921 --> 13:38:44,652 Or, if you think back to week\ntwo, when we sort of peeled back 14180 13:38:44,652 --> 13:38:47,622 the layer of what Hello,\nof what make was doing 14181 13:38:47,622 --> 13:38:50,832 you could more verbosely type out\n 14182 13:38:50,832 --> 13:38:54,162 clang in our case, command line\narguments like dash o hello 14183 13:38:54,161 --> 13:38:56,361 to specify what name you want to create. 14184 13:38:56,362 --> 13:38:58,182 And then you can specify the file name. 14185 13:38:58,182 --> 13:39:01,572 And then you can specify what\nlibraries you want to link in. 14186 13:39:01,572 --> 13:39:03,072 So that was a very verbose approach. 14187 13:39:03,072 --> 13:39:05,451 But it was always a two-step approach. 
14188 13:39:05,451 --> 13:39:08,201 And so, even as you've been\ndoing recent problem sets 14189 13:39:08,201 --> 13:39:11,921 odds are you've realized that, any time\n 14190 13:39:11,921 --> 13:39:16,182 or make a change to your code\nand try and test your code again 14191 13:39:16,182 --> 13:39:18,881 you're constantly doing those two steps. 14192 13:39:18,881 --> 13:39:22,362 Moving forward in Python,\nit's going to become simpler 14193 13:39:22,362 --> 13:39:24,131 and it's going to be just this. 14194 13:39:24,131 --> 13:39:26,981 The file name is going to change,\n 14195 13:39:26,982 --> 13:39:31,782 It's going to be something like\n 14196 13:39:31,781 --> 13:39:34,512 And that's just a convention,\nusing a different file extension. 14197 13:39:34,512 --> 13:39:37,302 But there's no compilation step per se. 14198 13:39:37,302 --> 13:39:40,692 You jump right to the\nexecution of your code. 14199 13:39:40,692 --> 13:39:43,722 And so Python, it turns out, is\n 14200 13:39:43,722 --> 13:39:48,671 we're going to start using, it's also\n 14201 13:39:48,671 --> 13:39:52,542 assuming it's been pre-installed,\n 14202 13:39:52,542 --> 13:39:56,622 This is to say that Python is generally\n 14203 13:39:57,881 --> 13:40:01,691 And by that, I mean you get to skip,\n 14204 13:40:02,891 --> 13:40:07,391 There is no manual step in the world of\n 14205 13:40:07,391 --> 13:40:11,052 and then compiling it to zeros and ones,\n 14206 13:40:11,052 --> 13:40:13,391 Instead, these kind of\ntwo steps get collapsed 14207 13:40:13,391 --> 13:40:19,091 into the illusion of one, whereby you,\n 14208 13:40:19,091 --> 13:40:22,721 and let the computer figure\nout how to actually convert it 14209 13:40:22,722 --> 13:40:24,762 to something the computer understands. 
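To make the one-step workflow concrete, here is a tiny sketch. The file name hello.py is an assumption (the transcript is cut off where the name is given), and on some systems the command is python3 rather than python.

```python
# hello.py (hypothetical file name)
# Run it directly, with no separate compile step:
#     python hello.py
message = "hello, world"
print(message)
```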
14210 13:40:24,762 --> 13:40:28,372 And the way we do that is via this\n 14211 13:40:28,372 --> 13:40:30,432 But now, when you have\nsource code, it's going 14212 13:40:30,432 --> 13:40:33,372 to be passed into an\ninterpreter, not a compiler. 14213 13:40:33,372 --> 13:40:35,922 And the best analog of this\nis just to perhaps point out 14214 13:40:35,921 --> 13:40:38,472 that, in the human world, if\nyou speak, or don't speak 14215 13:40:38,472 --> 13:40:42,162 multiple human languages, it can\n 14216 13:40:42,161 --> 13:40:43,791 from one language to another. 14217 13:40:43,792 --> 13:40:46,692 For instance, here are step-by-step\n 14218 13:40:46,692 --> 13:40:49,062 in a phone book,\nunfortunately, in Spanish. 14219 13:40:49,061 --> 13:40:51,881 Unfortunately, if you don't\nspeak or read Spanish. 14220 13:40:53,082 --> 13:40:55,902 You could run this algorithm, but you're\n 14221 13:40:55,902 --> 13:40:58,652 or you're going to have to open\n 14222 13:40:58,652 --> 13:40:59,982 to English and convert this. 14223 13:40:59,982 --> 13:41:03,582 And the catch with translating\nany language, human or computer 14224 13:41:03,582 --> 13:41:07,372 or otherwise, is that you're going\n 14225 13:41:07,372 --> 13:41:10,362 And so converting this in\nSpanish to this in English 14226 13:41:10,362 --> 13:41:12,881 is just going to take you\nlonger than if this were already 14227 13:41:14,974 --> 13:41:17,891 And that's going to be one of the\n 14228 13:41:17,891 --> 13:41:21,701 Yes, it's a feature that you can\n 14229 13:41:21,701 --> 13:41:24,401 to bother compiling it manually first. 14230 13:41:25,572 --> 13:41:27,336 And things might be a little slower. 14231 13:41:27,336 --> 13:41:28,961 Now, there's ways to chip away at that. 14232 13:41:28,961 --> 13:41:30,336 But we'll see an example thereof. 
14233 13:41:30,336 --> 13:41:33,222 In fact, let me transition now\nto just a couple of examples 14234 13:41:33,222 --> 13:41:37,182 that demonstrate how Python is\nnot only easier for many people 14235 13:41:37,182 --> 13:41:39,762 to use, perhaps yourselves\ntoo, because it throws away 14236 13:41:39,762 --> 13:41:42,641 a lot of the annoying syntax,\nit shortens the number of lines 14237 13:41:42,641 --> 13:41:46,332 you have to write, and also it\n 14238 13:41:46,332 --> 13:41:51,262 you can just do so much more without\n 14239 13:41:51,262 --> 13:41:54,192 So, as an example of this,\nlet me switch over here 14240 13:41:54,192 --> 13:42:00,612 to this image from problem set 4, which\n 14241 13:42:01,811 --> 13:42:03,766 And this is the original\nphoto, pretty clear 14242 13:42:03,766 --> 13:42:06,891 and it's even higher res if we looked\n 14243 13:42:06,891 --> 13:42:10,182 But there have been no filters, a\n 14244 13:42:10,182 --> 13:42:13,271 Recall, for problem set four, you\n 14245 13:42:13,271 --> 13:42:14,981 And among them might have been blur. 14246 13:42:14,982 --> 13:42:18,132 And blur was probably among the\nmore challenging of the ones 14247 13:42:18,131 --> 13:42:20,711 because you had to iterate\nover all of the pixels 14248 13:42:20,711 --> 13:42:23,652 you had to take into account what's\n 14249 13:42:24,012 --> 13:42:25,970 I mean, there was a lot\nof math and arithmetic. 14250 13:42:25,970 --> 13:42:29,141 And if you ultimately got it, it was\n 14251 13:42:29,141 --> 13:42:31,302 But that was probably\nseveral hours later. 14252 13:42:31,302 --> 13:42:34,062 In a language like\nPython, where there might 14253 13:42:34,061 --> 13:42:37,691 be libraries that had been written\nby others, on whose shoulders 14254 13:42:37,692 --> 13:42:40,402 you can stand, we could\nperhaps do something like this. 
14255 13:42:40,402 --> 13:42:44,802 Let me go ahead and run a program, or\n 14256 13:42:44,802 --> 13:42:48,652 And in Blur.py, in VS\nCode, let me just do this. 14257 13:42:48,652 --> 13:42:51,891 Let me import from a library,\nnot the CS50 library 14258 13:42:51,891 --> 13:42:56,141 but the Pillow library, so to\nspeak, a keyword called image 14259 13:42:56,141 --> 13:42:59,851 and another one called image\nfilter, then let me go ahead 14260 13:42:59,851 --> 13:43:02,941 and say, let me open the current\nversion of this image, which 14261 13:43:04,262 --> 13:43:06,781 So the before version\nof the image will be 14262 13:43:06,781 --> 13:43:11,072 the result of calling image.open\nquote unquote "Bridge.bmp 14263 13:43:11,072 --> 13:43:13,561 and then, let me create\nan after version. 14264 13:43:13,561 --> 13:43:15,362 So you'll see before and after. 14265 13:43:15,362 --> 13:43:21,531 After equals the before version\n.filter of image filter. 14266 13:43:21,531 --> 13:43:23,281 And there is, if I\nread the documentation 14267 13:43:23,281 --> 13:43:25,574 I'll see that there's something\ncalled a box blur, that 14268 13:43:25,574 --> 13:43:28,682 allows you to blur in box\nformat, like one pixel above 14269 13:43:30,271 --> 13:43:31,888 So I'll do one pixel there. 14270 13:43:31,889 --> 13:43:34,472 And then, after that's done, let\nme go ahead and save the file 14271 13:43:38,701 --> 13:43:41,432 Assuming this library\nworks as described 14272 13:43:41,432 --> 13:43:44,582 I am opening the file\nin Python, using line 3. 14273 13:43:44,582 --> 13:43:46,202 And this is somewhat new syntax. 
14274 13:43:46,201 --> 13:43:49,771 In the world of Python, we're going to\n 14275 13:43:49,771 --> 13:43:51,841 more, because in the\nworld of Python, you have 14276 13:43:51,841 --> 13:43:56,221 what's called object-oriented\n 14277 13:43:56,222 --> 13:43:58,991 And what this means is that\nyou still have functions 14278 13:43:58,991 --> 13:44:01,502 you still have variables,\nbut sometimes those functions 14279 13:44:01,502 --> 13:44:05,372 are embedded inside of the\nvariables, or, more specifically 14280 13:44:05,372 --> 13:44:07,232 inside of the data types themselves. 14281 13:44:07,232 --> 13:44:10,952 Think back to C. When you wanted\n 14282 13:44:10,951 --> 13:44:15,103 there was a to upper function that takes\n 14283 13:44:15,103 --> 13:44:18,061 And you can pass in any char you\n 14284 13:44:19,411 --> 13:44:22,681 Well, you know what, if that's\nsuch a common paradigm, where 14285 13:44:22,682 --> 13:44:26,372 upper-casing chars is a useful\n 14286 13:44:26,372 --> 13:44:30,992 is it embeds into the string\ndata type, or char if you will 14287 13:44:30,991 --> 13:44:35,761 the ability just to uppercase any char\n 14288 13:44:35,762 --> 13:44:38,672 as though it's a struct\nin C. Recall that structs 14289 13:44:38,671 --> 13:44:40,921 encapsulate multiple types of values. 14290 13:44:40,921 --> 13:44:44,131 In object-oriented programming,\nin a language like Python 14291 13:44:44,131 --> 13:44:48,031 you can encapsulate not just\nvalues, but also functionality. 14292 13:44:48,031 --> 13:44:50,339 Functions can now be inside of structs. 14293 13:44:50,339 --> 13:44:52,381 But we're not going to\ncall them structs anymore. 14294 13:44:52,381 --> 13:44:53,792 We're going to call them objects. 14295 13:44:53,792 --> 13:44:55,652 But that's just a different vernacular. 14296 13:44:57,391 --> 13:45:00,391 Inside of the image library,\nthere's a function called open 14297 13:45:00,391 --> 13:45:03,152 and it takes an argument, the\nname of the file, to open. 
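The point above, that in Python a function can live inside the data type itself, can be sketched briefly. This is only an illustration: `str.upper` is a real built-in method, while the `Greeting` class and its `shout` method are hypothetical names made up for this example.

```python
# In C, toupper is a free-standing function that you pass a char to.
# In Python, the string object carries its own upper method.
before = "hello"
after = before.upper()  # the function is reached through the value, with a dot
print(after)


class Greeting:
    """A hypothetical object that bundles a value with functionality."""

    def __init__(self, text):
        self.text = text  # the value, much like a field in a C struct

    def shout(self):
        # The functionality lives inside the object alongside the value.
        return self.text.upper() + "!"


print(Greeting("hello").shout())
```

The dot in `before.upper()` is the same dot used in `before.filter(...)` in the image example: it reaches into the object for a function that belongs to it.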
14298 13:45:03,152 --> 13:45:06,781 Once I have a variable called before,\n 14299 13:45:06,781 --> 13:45:09,811 an object, inside of\nwhich is now, because it 14300 13:45:09,811 --> 13:45:12,661 was returned from this\nfunction, a function 14301 13:45:12,661 --> 13:45:14,801 called filter, that takes an argument. 14302 13:45:14,802 --> 13:45:18,182 The argument here happens\nto be ImageFilter.BoxBlur(1) 14303 13:45:19,351 --> 13:45:21,324 But it just returns the filter to use. 14304 13:45:21,324 --> 13:45:23,491 And then, after, dot save\ndoes what you might think. 14305 13:45:24,671 --> 13:45:27,991 So instead of using fopen and\nfwrite, you just say dot save 14306 13:45:27,991 --> 13:45:31,031 and that does all of\nthat messy work for you. 14307 13:45:31,031 --> 13:45:33,752 So it's just, what, four\nlines of code total? 14308 13:45:33,752 --> 13:45:36,762 Let me go ahead and go\ndown to my terminal window. 14309 13:45:36,762 --> 13:45:40,055 Let me go ahead and show you\nwith LS that, at the moment 14310 13:45:40,055 --> 13:45:41,972 whoops, sorry, let me\nnot bother showing that 14311 13:45:41,972 --> 13:45:43,682 because I have other examples to come. 14312 13:45:43,682 --> 13:45:50,832 I'm going to go ahead and do Python\n 14313 13:45:50,832 --> 13:45:52,092 I did need to make a command. 14314 13:45:52,802 --> 13:45:55,862 OK, let me go ahead and type LS\n 14315 13:45:55,862 --> 13:45:58,082 is among the sample code online today. 14316 13:45:58,082 --> 13:46:01,322 There's only one file\ncalled Bridge.bmp, dammit 14317 13:46:01,322 --> 13:46:04,152 I'm trying to get these\nthings ready at the same time. 14318 13:46:05,252 --> 13:46:08,641 Let me move this code into place.
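The blur program described above can be sketched as follows. It assumes the Pillow library is installed (imported as `PIL`). The calls `Image.open`, `filter`, `ImageFilter.BoxBlur`, and `save` are Pillow's own, but wrapping them in a function named `blur` with these parameter names is a choice made here for illustration; the lecture writes the same steps at the top level of the file, and the file names in the docstring are examples only.

```python
from PIL import Image, ImageFilter


def blur(source, destination, radius=1):
    """Blur the image at source and write the result to destination.

    For example, blur("bridge.bmp", "out.bmp") for a one-pixel box blur.
    """
    before = Image.open(source)                         # read the original image
    after = before.filter(ImageFilter.BoxBlur(radius))  # average each pixel with its neighbors
    after.save(destination)                             # takes the place of fopen and fwrite
```

Passing a larger radius, such as 10, widens the box of neighboring pixels that are averaged together, so the blur becomes more pronounced.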
14319 13:46:08,641 --> 13:46:11,231 All right, I've gone ahead\nand moved this file, Blur.py 14320 13:46:11,232 --> 13:46:13,712 into a folder called\nfilter, inside of which 14321 13:46:13,711 --> 13:46:18,601 there's another file called Bridge.bmp,\n 14322 13:46:18,601 --> 13:46:20,911 Let me now go ahead\nand run Python, which 14323 13:46:20,911 --> 13:46:23,221 is my interpreter, and also\nthe name of the language 14324 13:46:23,222 --> 13:46:25,512 and run Python on this file. 14325 13:46:25,512 --> 13:46:27,870 So much like running\nthe Spanish algorithm 14326 13:46:27,870 --> 13:46:29,912 through Google Translate,\nor something like that 14327 13:46:29,911 --> 13:46:32,171 as input, to get back\nthe English output 14328 13:46:32,171 --> 13:46:36,061 this is going to translate the\nPython language to something 14329 13:46:36,061 --> 13:46:38,281 this computer, or this\ncloud-based environment 14330 13:46:38,281 --> 13:46:41,591 understands, and then run the\ncorresponding code, top to bottom 14331 13:46:42,228 --> 13:46:43,561 I'm going to go ahead and Enter. 14332 13:46:43,561 --> 13:46:45,451 No error message is\ngenerally a good thing. 14333 13:46:45,451 --> 13:46:48,481 If I type LS you'll now see out.bmp. 14334 13:46:48,482 --> 13:46:49,817 Let me go ahead and open that. 14335 13:46:49,817 --> 13:46:52,442 And, you know what, just to make\nclear what's really happening 14336 13:46:52,442 --> 13:46:53,609 let me blur it even further. 14337 13:46:53,608 --> 13:46:57,072 Let's make a box that's not\njust one pixel around, but 10. 14338 13:46:58,472 --> 13:47:01,351 And let me just go ahead and\nrerun it with Python of Blur.py. 14339 13:47:03,841 --> 13:47:08,621 Let me go ahead and open Out.bmp\nand show you first the before 14340 13:47:11,072 --> 13:47:14,341 And now, crossing my fingers,\nfour lines of code later 14341 13:47:14,341 --> 13:47:16,279 the result of blurring it, as well. 
14342 13:47:16,279 --> 13:47:18,572 So the library is doing all\nof the same kind of legwork 14343 13:47:18,572 --> 13:47:20,641 that you all did for\nthe assignment, but it's 14344 13:47:20,641 --> 13:47:24,824 encapsulated it all into a single\n 14345 13:47:24,824 --> 13:47:27,241 Those of you who might have\nbeen feeling more comfortable 14346 13:47:27,241 --> 13:47:29,116 might have done a little\nsomething like this. 14347 13:47:29,116 --> 13:47:33,421 Let me go ahead and open up one\nother file, called Edges.py. 14348 13:47:33,421 --> 13:47:36,811 And in Edges.py, I'm again going\n 14349 13:47:36,811 --> 13:47:39,531 the image keyword, and the image filter. 14350 13:47:39,531 --> 13:47:42,031 Then I'm going to go ahead and\ncreate a before image, that's 14351 13:47:42,031 --> 13:47:46,112 a result of calling image.open\nof the same thing, Bridge.bmp 14352 13:47:46,112 --> 13:47:53,432 then I'm going to go ahead and run a\n 14353 13:47:53,432 --> 13:47:58,372 ImageFilter.FIND_EDGES, which\nis like a constant, if you will 14354 13:47:58,372 --> 13:48:00,230 defined inside of this library for us. 14355 13:48:00,230 --> 13:48:02,272 And then I'm going to do\nafter.save quote unquote 14356 13:48:02,271 --> 13:48:04,731 "Out.bmp," using the same file name. 14357 13:48:04,732 --> 13:48:13,012 I'm now going to run Python of\n 14358 13:48:13,012 --> 13:48:15,452 We'll see what syntax error means soon. 14359 13:48:15,451 --> 13:48:17,991 Let me go ahead and run\nthe code now, Edges.py. 14360 13:48:17,991 --> 13:48:21,351 Let me now open that new file, Out.bmp. 14361 13:48:21,351 --> 13:48:26,031 And before we had this, and now,\n 14362 13:48:26,031 --> 13:48:28,731 if we did the more comfortable\nversion of P set 4 14363 13:48:28,732 --> 13:48:31,862 we now get this, after\njust four lines of code. 14364 13:48:31,862 --> 13:48:34,641 So again, suggesting the power\nof using a language that's better 14365 13:48:34,641 --> 13:48:36,082 optimized for the tool at hand.
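The edge-detection program follows the same shape as the blur one. A minimal sketch, again assuming Pillow is installed: `ImageFilter.FIND_EDGES` is a predefined filter constant in that library, while the function name `find_edges` and its parameters are hypothetical and chosen only for this illustration.

```python
from PIL import Image, ImageFilter


def find_edges(source, destination):
    """Write an edge-detected copy of the image at source to destination."""
    before = Image.open(source)
    # FIND_EDGES is a ready-made filter defined by the library,
    # so it is passed by name rather than called with arguments.
    after = before.filter(ImageFilter.FIND_EDGES)
    after.save(destination)
```

The only line that differs from the blur sketch is the argument handed to `filter`, which is the sense in which the library has encapsulated the work of the problem set.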
14366 13:48:36,082 --> 13:48:39,472 And at the risk of really\nmaking folks sad, let's go ahead 14367 13:48:39,472 --> 13:48:43,342 and re-implement, if we could,\n 14368 13:48:43,341 --> 13:48:47,601 Let me go ahead and open\nanother version of this code 14369 13:48:47,601 --> 13:48:50,828 wherein I have a C\nversion, just from problem 14370 13:48:50,828 --> 13:48:52,911 set five, wherein you\nimplemented a spell checker 14371 13:48:52,911 --> 13:48:55,161 loading 100,000 plus words into memory. 14372 13:48:55,161 --> 13:48:58,911 And then you kept track of just\n 14373 13:48:58,911 --> 13:49:00,861 And that probably took\na while, implementing 14374 13:49:00,862 --> 13:49:03,052 all of those functions in Dictionary.c. 14375 13:49:03,052 --> 13:49:08,762 Let me instead now go into a\nnew file, called Dictionary.py. 14376 13:49:08,762 --> 13:49:11,722 And let me stipulate, for\nthe sake of discussion 14377 13:49:11,722 --> 13:49:14,182 that we already wrote\nin advance, Speller.py 14378 13:49:14,182 --> 13:49:16,372 which corresponds to Speller.c. 14379 13:49:16,372 --> 13:49:17,902 You didn't write either of those. 14380 13:49:17,902 --> 13:49:20,122 Recall for problem set\nfive, we gave you Speller.c. 14381 13:49:20,122 --> 13:49:22,080 Assume that we're going\nto give you Speller.py. 14382 13:49:22,080 --> 13:49:28,552 So the onus on us right now is only\n 14383 13:49:28,552 --> 13:49:31,462 All right, so I'm going to go\nahead and define a few functions. 14384 13:49:31,461 --> 13:49:34,521 And we're going to see now the syntax\n 14385 13:49:34,521 --> 13:49:38,752 I want to go ahead and define\nfirst, a hash table, which 14386 13:49:38,752 --> 13:49:41,362 was the very first thing\nyou defined in Dictionary.c. 14387 13:49:41,362 --> 13:49:46,491 I'm going to go ahead, then, and say\n 14388 13:49:46,491 --> 13:49:48,205 otherwise known as a hash table. 
14389 13:49:48,205 --> 13:49:50,122 All right, now let me\ndefine a function called 14390 13:49:50,122 --> 13:49:53,152 check, which was the first function\nyou might have implemented. 14391 13:49:53,152 --> 13:49:55,522 Check is going to take a word,\nand you'll see in Python 14392 13:49:55,521 --> 13:49:56,896 the syntax is a little different. 14393 13:49:56,896 --> 13:49:58,401 You don't specify the return type. 14394 13:49:58,402 --> 13:50:01,132 You use the word Def instead to define. 14395 13:50:01,131 --> 13:50:05,061 You still specify the name of the\n 14396 13:50:05,061 --> 13:50:07,731 But you omit any mention of types. 14397 13:50:07,732 --> 13:50:09,802 But you do use a colon and indent. 14398 13:50:09,802 --> 13:50:14,302 So how do I check if a word is in\n 14399 13:50:14,302 --> 13:50:17,962 Well, in Python, I can\njust say, if word in words 14400 13:50:17,961 --> 13:50:23,091 go ahead and return true, else\ngo ahead and return false, done 14401 13:50:24,470 --> 13:50:26,161 All right, now I want to do like load. 14402 13:50:26,161 --> 13:50:29,161 That was the heavy lift, where you\n 14403 13:50:29,161 --> 13:50:30,828 So let me define a function called load. 14404 13:50:30,828 --> 13:50:33,171 It takes a string, the\nname of a file to load. 14405 13:50:33,171 --> 13:50:36,502 So I'll call that Dictionary,\njust like in C, but no data type. 14406 13:50:36,502 --> 13:50:40,701 Let me go ahead and open a file by\n 14407 13:50:40,701 --> 13:50:43,261 by opening that Dictionary in read mode. 14408 13:50:43,262 --> 13:50:46,882 So this is a little similar to fopen,\n 14409 13:50:46,881 --> 13:50:49,401 Then let me iterate over\nevery line in the file. 
14410 13:50:49,402 --> 13:50:54,322 In Python, this is pretty pleasant,\n 14411 13:50:54,322 --> 13:50:59,031 How, now, do I get at the current\n 14412 13:50:59,031 --> 13:51:02,091 because in this file of\nwords, 140,000 words 14413 13:51:02,091 --> 13:51:05,273 there's word backslash n,\nword backslash n, all right? 14414 13:51:05,273 --> 13:51:07,731 Well, let me go ahead and get\na word from the current line 14415 13:51:07,732 --> 13:51:11,362 but strip off, from the right end\n 14416 13:51:11,362 --> 13:51:14,061 the Rstrip function\nin Python does for me. 14417 13:51:14,061 --> 13:51:18,891 Then let me go ahead and add to my\n 14418 13:51:19,552 --> 13:51:22,057 Let me go ahead and close\nthe file for good measure. 14419 13:51:22,057 --> 13:51:24,682 And then let me go ahead and\nreturn true, because all was well. 14420 13:51:24,682 --> 13:51:26,842 That's it for the load\nfunction in Python. 14421 13:51:26,841 --> 13:51:28,101 How about the size function? 14422 13:51:28,101 --> 13:51:31,341 This did not take any arguments, it\n 14423 13:51:32,512 --> 13:51:36,502 I can do that by returning the\n 14424 13:51:36,502 --> 13:51:41,182 And then lastly, gone from the\n 14425 13:51:42,612 --> 13:51:45,472 So no matter what I do,\nthere's nothing to unload. 14426 13:51:45,472 --> 13:51:47,342 The computer will do that for me. 14427 13:51:47,341 --> 13:51:51,381 So I give you, in these functions,\nproblem set five in Python. 14428 13:51:51,381 --> 13:51:53,542 So, I'm sorry, we made\nyou write it in C first. 
14429 13:51:53,542 --> 13:51:57,141 But the implication now is that,\nwhat are you getting for free 14430 13:51:58,372 --> 13:52:00,891 Well, encapsulated in\nthis one line of code 14431 13:52:00,891 --> 13:52:04,792 is much of what you wrote for\nproblem set five, implementing 14432 13:52:04,792 --> 13:52:07,792 your array for all of your\nletters of the alphabet or more 14433 13:52:07,792 --> 13:52:10,912 all of the linked lists that you\nimplemented to create chains 14434 13:52:10,911 --> 13:52:12,451 to store all of those words. 14435 13:52:13,582 --> 13:52:16,612 It's just someone else in the\nworld wrote that code for you. 14436 13:52:16,612 --> 13:52:19,582 And you can now use it\nby way of a dictionary. 14437 13:52:19,582 --> 13:52:22,072 And actually, I can\nchange this a little bit 14438 13:52:22,072 --> 13:52:25,192 because add is technically not\nthe right function to use here. 14439 13:52:25,192 --> 13:52:28,141 I'm actually treating the dictionary\n 14440 13:52:28,141 --> 13:52:31,942 So I'm going to make one tweak, set\n 14441 13:52:31,942 --> 13:52:34,222 But set just allows it\nto handle duplicates 14442 13:52:34,222 --> 13:52:36,952 and it allows me to just throw\nthings into it by literally 14443 13:52:36,951 --> 13:52:38,841 using a function as simple as add. 14444 13:52:38,841 --> 13:52:41,691 And I'm going to make\none other tweak here 14445 13:52:41,692 --> 13:52:46,312 because, when I'm checking a word,\n 14446 13:52:46,311 --> 13:52:49,042 to me in uppercase or capitalized. 14447 13:52:49,042 --> 13:52:52,402 It's not going to necessarily come\n 14448 13:52:53,991 --> 13:52:58,911 I can force every word to\nlowercase by using word.lower. 14449 13:52:58,911 --> 13:53:01,021 And I don't have to do it\ncharacter for character 14450 13:53:01,021 --> 13:53:06,322 I can do the whole darn string at\n 14451 13:53:06,322 --> 13:53:09,381 All right, let me go ahead and\nopen up a terminal window here. 
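Putting the steps above together, Dictionary.py might look roughly like this sketch. The names `words`, `check`, `load`, `size`, and `unload` follow the lecture. Several cues are cut off in this transcript, so returning `len(words)` from `size` and `True` from `unload` are assumptions about what was typed on screen.

```python
# The "hash table": a set stores each word once and handles lookups for us.
words = set()


def check(word):
    """Return True if word is in the dictionary, ignoring capitalization."""
    if word.lower() in words:
        return True
    else:
        return False


def load(dictionary):
    """Read one word per line from the file named dictionary into memory."""
    file = open(dictionary, "r")
    for line in file:
        word = line.rstrip()  # strip the trailing newline from the right end
        words.add(word)
    file.close()
    return True


def size():
    """Return the number of words loaded (assumed to be len of the set)."""
    return len(words)


def unload():
    """Nothing to free by hand; Python manages the memory for us."""
    return True
```

For example, after `load` is given a file containing the lines `cat` and `dog`, `check("CAT")` is true and `size()` is 2.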
14452 13:53:09,381 --> 13:53:12,639 And let me go into, first,\nmy C version, on the left. 14453 13:53:12,639 --> 13:53:15,682 And actually I'm going to go ahead\n 14454 13:53:15,682 --> 13:53:20,529 And on the right, I'm going to go into\n 14455 13:53:20,529 --> 13:53:23,362 But it's also available online, if\n 14456 13:53:23,362 --> 13:53:26,692 I'm going to go ahead and\nmake speller in C on the left 14457 13:53:26,692 --> 13:53:28,792 and note that it takes\na moment to compile. 14458 13:53:28,792 --> 13:53:33,052 Then I'm going to be ready to\nrun speller of dictionaries 14459 13:53:33,052 --> 13:53:35,852 let's do like the Sherlock\nHolmes text, which is pretty big. 14460 13:53:35,851 --> 13:53:40,491 And then over here, let me get\nready to run Python of speller 14461 13:53:44,254 --> 13:53:46,671 So the syntax is a little\ndifferent at the command prompt. 14462 13:53:46,671 --> 13:53:49,402 I just, on the left, have to\ncompile the code, with make 14463 13:53:49,402 --> 13:53:51,171 and then run it with ./speller. 14464 13:53:51,171 --> 13:53:52,891 On the right, I don't\nneed to compile it. 14465 13:53:52,891 --> 13:53:54,381 But I do need to use the interpreter. 14466 13:53:54,381 --> 13:53:56,752 So even though the lines are\nwrapping a little bit here 14467 13:53:56,752 --> 13:53:58,701 let me go ahead and run it on the right. 14468 13:53:58,701 --> 13:54:00,826 And I'm going to count how\nlong it takes, verbally 14469 13:54:02,091 --> 13:54:05,241 One Mississippi, two Mississippi,\nthree Mississippi, OK 14470 13:54:05,241 --> 13:54:07,711 so it's like three\nseconds, give or take. 14471 13:54:07,711 --> 13:54:10,042 Now running it in\nPython, keeping in mind 14472 13:54:10,042 --> 13:54:13,625 I spent way fewer hours implementing\na spell checker in Python 14473 13:54:13,625 --> 13:54:15,292 than you might have in problem set five. 
14474 13:54:15,292 --> 13:54:18,529 But what's the trade-off going to be,\n 14475 13:54:18,529 --> 13:54:20,362 do we all now need to\nbe making consciously? 14476 13:54:20,362 --> 13:54:22,822 Here we go, on the right, in Python. 14477 13:54:22,822 --> 13:54:26,542 One Mississippi, two Mississippi,\n 14478 13:54:26,542 --> 13:54:30,592 five Mississippi, six Mississippi,\n 14479 13:54:30,591 --> 13:54:33,621 nine Mississippi, 10\nMississippi, 11 Mississippi 14480 13:54:33,622 --> 13:54:36,512 all right, so 10 or 11 seconds. 14481 13:54:38,502 --> 13:54:43,072 Let's go to the group here, which\n 14482 13:54:43,072 --> 13:54:47,302 How might you answer that question,\n 14483 13:54:48,052 --> 13:54:50,260 AUDIENCE: I think Python's\nbetter for the programmer 14484 13:54:50,260 --> 13:54:54,368 more comfortable for the programmer,\n 14485 13:54:54,368 --> 13:54:56,201 DAVID J. MALAN: OK, so\nPython, to summarize 14486 13:54:56,201 --> 13:54:59,981 is better for the programmer,\n 14487 13:54:59,982 --> 13:55:02,982 but C is maybe better for the computer,\n 14488 13:55:02,982 --> 13:55:04,649 I think that's a reasonable formulation. 14489 13:55:07,110 --> 13:55:09,402 AUDIENCE: I think it depends\non the size of the project 14490 13:55:10,432 --> 13:55:12,807 So if it's going to be something\nthat's relatively quick 14491 13:55:12,807 --> 13:55:15,232 I might not care that it\ntakes 10 seconds to do it. 14492 13:55:15,232 --> 13:55:17,432 And it could be way faster\nto do it with Python. 14493 13:55:17,432 --> 13:55:20,592 Whereas with C, if I'm dealing\n 14494 13:55:20,591 --> 13:55:24,822 set or something huge, then that\n 14495 13:55:24,822 --> 13:55:29,262 it might be worth it to put in the\n 14496 13:55:29,262 --> 13:55:32,781 so the process continually will run\n 14497 13:55:32,781 --> 13:55:33,951 DAVID J. MALAN: Absolutely,\na really good answer. 
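The spoken count above is a rough stopwatch. The same comparison could be measured in code with the standard library's `time.perf_counter`. This is a sketch with made-up data rather than the lecture's speller: `timed` and `count_known` are hypothetical helpers, and the resulting time will vary from machine to machine.

```python
import time


def timed(function, *args):
    """Call function(*args) and return its result with the elapsed seconds."""
    start = time.perf_counter()
    result = function(*args)
    return result, time.perf_counter() - start


def count_known(words, text):
    """Count how many words in text appear in the set words, ignoring case."""
    return sum(1 for word in text if word.lower() in words)


# A tiny dictionary and a repeated four-word "text", one word of it misspelled.
words = {"the", "quick", "brown", "fox"}
text = ["The", "quick", "brwn", "fox"] * 100_000

known, seconds = timed(count_known, words, text)
print(f"{known} known words in {seconds:.3f} seconds")
```

Measuring like this makes the trade-off under discussion concrete: the programmer's time saved by writing less code can be weighed against a number of seconds rather than a verbal count.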
14498 13:55:33,951 --> 13:55:36,822 And let me summarize, is it depends\n 14499 13:55:36,822 --> 13:55:40,572 If you have a very large\ndata set, you might 14500 13:55:40,572 --> 13:55:43,650 want to optimize your code to be as\n 14501 13:55:43,650 --> 13:55:45,942 especially if you're running\nthat code again and again. 14502 13:55:45,942 --> 13:55:47,472 Maybe you're a company like Google. 14503 13:55:47,472 --> 13:55:49,632 People are searching a\nhuge database all the time. 14504 13:55:49,631 --> 13:55:52,271 You really want to squeeze\nevery bit of performance 14505 13:55:52,271 --> 13:55:53,743 as you can out of the computer. 14506 13:55:53,743 --> 13:55:56,201 You might want to have someone\nsmart take a language like C 14507 13:55:56,201 --> 13:55:57,971 and write it at a very low level. 14508 13:55:59,921 --> 13:56:02,671 They're going to have to deal with\n 14509 13:56:02,671 --> 13:56:06,011 But if and when it works correctly, it's\n 14510 13:56:06,012 --> 13:56:08,802 By contrast, if you have\na data set that's big 14511 13:56:08,802 --> 13:56:12,342 and 140,000 words is\nnot small, but you don't 14512 13:56:12,341 --> 13:56:15,461 want to spend like 5 hours,\n10 hours, a week of your time 14513 13:56:15,461 --> 13:56:17,584 building a spell\nchecker or a dictionary 14514 13:56:17,584 --> 13:56:20,502 you can instead leverage a different\n 14515 13:56:20,502 --> 13:56:25,211 and build on top of it, in order to\n 14516 13:56:27,362 --> 13:56:29,310 AUDIENCE: Would you,\nbecause with Python 14517 13:56:29,311 --> 13:56:33,450 doesn't it also like\nconvert the words, or like 14518 13:56:33,449 --> 13:56:35,060 convert the words, for a lesson? 14519 13:56:35,061 --> 13:56:37,103 When we convert that into\nthe same version again 14520 13:56:37,103 --> 13:56:40,670 do we just take that into view? 14521 13:56:40,669 --> 13:56:43,461 DAVID J. MALAN: That's a perfect\n 14522 13:56:43,461 --> 13:56:45,862 wanted to make, which was, is\nthere something in between? 
14523 13:56:46,881 --> 13:56:49,491 I'm oversimplifying what this\nlanguage is actually doing. 14524 13:56:49,491 --> 13:56:51,801 It's not as stark a difference\nas saying, like, hey 14525 13:56:51,802 --> 13:56:54,862 Python is four times slower than C.\n 14526 13:56:54,862 --> 13:56:57,982 There are absolutely ways that\nengineers can optimize languages 14527 13:56:57,982 --> 13:56:59,752 as they have already done for Python. 14528 13:56:59,752 --> 13:57:02,362 And in fact, I've configured\nmy settings in such a way 14529 13:57:02,362 --> 13:57:05,298 that I've kind of dramatized\njust how big the difference is. 14530 13:57:05,298 --> 13:57:07,131 It is going to be slower,\nPython, typically 14531 13:57:07,131 --> 13:57:08,451 than the equivalent C program. 14532 13:57:08,451 --> 13:57:10,461 But it doesn't have\nto be as big of a gap 14533 13:57:10,461 --> 13:57:14,241 as it is here, because, indeed, among\n 14534 13:57:14,241 --> 13:57:16,641 is to save some intermediate results. 14535 13:57:16,641 --> 13:57:19,881 Technically speaking, yes,\nPython is interpreting 14536 13:57:19,881 --> 13:57:23,211 Dictionary.py and these\nother files, translating them 14537 13:57:23,211 --> 13:57:24,724 from one language to another. 14538 13:57:24,724 --> 13:57:27,891 But that doesn't mean it has to do that\n 14539 13:57:27,891 --> 13:57:33,542 As you propose, you can save, or cache,\n 14540 13:57:33,542 --> 13:57:36,961 So that the second time and the third\n 14541 13:57:36,961 --> 13:57:39,951 And, in fact, Python itself, the\n 14542 13:57:39,951 --> 13:57:42,502 thereof, itself is\nactually implemented in C. 14543 13:57:42,502 --> 13:57:45,811 So you can make sure that your\n 14544 13:57:45,811 --> 13:57:47,871 And what then is maybe\nthe high level takeaway? 14545 13:57:47,872 --> 13:57:50,842 Yes, if you are going to try to\nsqueeze every bit of performance 14546 13:57:50,841 --> 13:57:54,231 out of your code, and\nmaybe code is constrained. 
14547 13:57:54,232 --> 13:57:55,672 Maybe you have very small devices. 14548 13:57:55,671 --> 13:57:57,292 Maybe it's like a watch nowadays. 14549 13:57:57,292 --> 13:58:02,842 Or maybe it's a sensor that's installed\n 14550 13:58:02,841 --> 13:58:06,231 or in infrastructure, where you\ndon't have much battery life 14551 13:58:06,232 --> 13:58:08,152 and you don't have much\nsize, you might want 14552 13:58:08,152 --> 13:58:10,232 to minimize just how\nmuch work is being done. 14553 13:58:10,232 --> 13:58:13,265 And so the faster the code runs,\n 14554 13:58:13,264 --> 13:58:14,932 if it's implemented something low level. 14555 13:58:14,932 --> 13:58:18,832 So C is still very commonly used\n 14556 13:58:18,832 --> 13:58:22,101 But, again, if you just want\nto solve real world problems 14557 13:58:22,101 --> 13:58:26,362 and get real work done, and your time\n 14558 13:58:26,362 --> 13:58:28,522 than the device you're\nrunning it on, long term 14559 13:58:28,521 --> 13:58:31,879 you know what, Python is among the\n 14560 13:58:31,879 --> 13:58:34,671 And frankly, if I were implementing\n 14561 13:58:34,671 --> 13:58:36,231 I'm probably starting with Python. 14562 13:58:36,232 --> 13:58:38,065 And I'm not going to\nwaste time implementing 14563 13:58:38,065 --> 13:58:41,452 all of that low-level stuff, because\n 14564 13:58:41,451 --> 13:58:45,981 modern languages is to use abstractions\n 14565 13:58:45,982 --> 13:58:49,432 And by abstraction, I mean something\n 14566 13:58:49,432 --> 13:58:51,891 that just gives you a\ndictionary, or hash table 14567 13:58:51,891 --> 13:58:55,747 or the equivalent version that I\n 14568 13:58:55,747 --> 13:58:59,242 All right, any questions,\nthen, on Python thus far? 
14569 13:59:04,232 --> 13:59:06,442 AUDIENCE: Could you\ncompile the Python code 14570 13:59:06,442 --> 13:59:11,132 or is there some, I'd imagine that\n 14571 13:59:11,131 --> 13:59:14,701 but it feels like if you can just\n 14572 13:59:14,701 --> 13:59:16,614 that would give you the\nbest of both worlds. 14573 13:59:16,614 --> 13:59:18,781 DAVID J. MALAN: Really good\nquestion or observation 14574 13:59:18,781 --> 13:59:20,239 could you just compile Python code? 14575 13:59:20,239 --> 13:59:23,701 Yes, absolutely, this idea of\n 14576 13:59:23,701 --> 13:59:26,011 is not native to the language itself. 14577 13:59:26,012 --> 13:59:28,932 It tends to be native to the\nconventions that we humans use. 14578 13:59:28,932 --> 13:59:31,252 So you could actually\nwrite an interpreter for C 14579 13:59:31,252 --> 13:59:34,502 that would read it top to bottom, left\n 14580 13:59:34,502 --> 13:59:38,162 something the computer understands, but\n 14581 13:59:38,161 --> 13:59:40,081 C is generally a compiled language. 14582 13:59:41,192 --> 13:59:44,531 What Python nowadays is actually\n 14583 13:59:44,531 --> 13:59:46,741 It technically is, sort\nof unbeknownst to us 14584 13:59:46,741 --> 13:59:50,491 compiling the code, technically\n 14585 13:59:50,491 --> 13:59:54,031 into something called byte code,\n 14586 13:59:54,031 --> 13:59:58,031 just doesn't take as much time as it\n 14587 13:59:58,031 --> 14:00:00,898 And this is an area of research\nfor computer scientists working 14588 14:00:00,898 --> 14:00:03,481 in programming languages, to\nimprove these kinds of paradigms. 14589 14:00:04,021 --> 14:00:07,261 Well, honestly, for you and I, the\n 14590 14:00:07,262 --> 14:00:10,322 one, run the code and not worry\nabout the stupid second step 14591 14:00:10,322 --> 14:00:11,622 of compiling it all the time. 14592 14:00:12,122 --> 14:00:14,742 It's literally half as many\nsteps for me, the human. 14593 14:00:14,741 --> 14:00:17,021 And that's a nice thing to optimize for. 
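The byte code mentioned above can be inspected from Python itself. A small sketch using only the standard library: the built-in `compile` turns source text into a code object, `dis.dis` prints that object's byte code instructions, and `exec` runs it. The file name "hello.py" passed to `compile` is only a label here; no file is read.

```python
import dis

source = 'print("hello, world")'

# Compile the source text to a code object, the form the interpreter runs.
code = compile(source, "hello.py", "exec")

# Print the byte code instructions; the exact listing differs between Python versions.
dis.dis(code)

# Run the already-compiled code object without translating the source again.
exec(code)
```

The standard Python interpreter applies the same idea to imported modules, saving their compiled byte code as .pyc files in a `__pycache__` folder so that later runs can skip the translation step.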
14594 14:00:17,021 --> 14:00:20,851 And ultimately, too, you might\n 14595 14:00:20,851 --> 14:00:22,441 come with these other languages. 14596 14:00:22,442 --> 14:00:24,482 So you should really\njust be fine-tuning how 14597 14:00:24,482 --> 14:00:28,322 you can enable these features, as\n 14598 14:00:28,322 --> 14:00:31,112 And, in fact, the only time\nI personally ever use C 14599 14:00:31,112 --> 14:00:34,472 is from like September to October\nof every year, during CS50. 14600 14:00:34,472 --> 14:00:36,872 Almost every other month\ndo I reach for Python 14601 14:00:36,872 --> 14:00:40,211 or another language called JavaScript,\n 14602 14:00:40,211 --> 14:00:44,161 which is not to impugn C. It's just that\n 14603 14:00:44,161 --> 14:00:47,551 fits for the amount of time I have to\n 14604 14:00:48,427 --> 14:00:50,927 All right, let's go ahead and\ntake a five minute break here. 14605 14:00:50,927 --> 14:00:53,912 And when we come back, we'll start\n 14606 14:00:54,822 --> 14:00:58,262 So let's go ahead and start writing\nsome code from the beginning 14607 14:00:58,262 --> 14:01:01,232 here, whereby we start small\nwith some simple examples 14608 14:01:01,232 --> 14:01:04,564 and then we'll build our way up to\n 14609 14:01:04,563 --> 14:01:06,271 But what we'll do\nalong the way is first 14610 14:01:06,271 --> 14:01:08,386 look side by side at\nwhat the C code looked 14611 14:01:08,387 --> 14:01:11,162 like way back in week 1\nor 2 or 3 and so forth 14612 14:01:11,161 --> 14:01:13,411 and then write the corresponding\nPython code at right. 14613 14:01:13,411 --> 14:01:16,051 And then we'll transition just\nto focusing on Python itself. 14614 14:01:16,052 --> 14:01:18,844 What I've done in advance today is\n 14615 14:01:18,843 --> 14:01:21,451 from the course's website,\nmy source 6 directory, which 14616 14:01:21,451 --> 14:01:24,346 contains all of the pre-written\nC code from weeks past. 
14617 14:01:24,347 --> 14:01:26,222 But it'll also have\ncopies of the Python code 14618 14:01:26,222 --> 14:01:28,182 we'll write here together and look at. 14619 14:01:28,182 --> 14:01:31,967 So first, here is\nHello.c back from week 0. 14620 14:01:31,966 --> 14:01:33,844 And this was version 0 of it. 14621 14:01:33,845 --> 14:01:35,262 I'm going to go ahead and do this. 14622 14:01:35,262 --> 14:01:38,762 I'm going to go ahead and\nsplit my code window up here. 14623 14:01:38,762 --> 14:01:41,564 I'm going to go ahead and create\na new file called Hello.py. 14624 14:01:41,563 --> 14:01:43,771 And this isn't something\nyou'll typically have to do 14625 14:01:43,771 --> 14:01:45,331 laying your code out side by side. 14626 14:01:45,332 --> 14:01:47,402 But I've just clicked the\nlittle icon in VS Code 14627 14:01:47,402 --> 14:01:50,851 that looks like two columns, that\n 14628 14:01:50,851 --> 14:01:53,851 so that we can, in fact, see\nthings, for now, side by side 14629 14:01:53,851 --> 14:01:55,309 with my terminal window down below. 14630 14:01:55,309 --> 14:01:58,269 All right, now I'm going to go ahead\n 14631 14:01:58,269 --> 14:02:01,082 program on the right, which,\nrecall, was just print, quote 14632 14:02:01,082 --> 14:02:03,692 unquote, "Hello, world," and that\'s it. 14633 14:02:03,692 --> 14:02:05,942 Now down in my terminal\nwindow, I'm going 14634 14:02:05,942 --> 14:02:09,602 to go ahead and run Python of\nHello.py, Enter, and voila 14635 14:02:10,972 --> 14:02:13,472 So again, I'm not going to play\nany further with the C code. 14636 14:02:13,472 --> 14:02:15,452 It's there just to jog\nyour memory left and right. 14637 14:02:15,451 --> 14:02:17,761 So let's now look at a second\nversion of Hello, world 14638 14:02:17,762 --> 14:02:20,974 from that first week, whereby\nif I go and get Hello1.c 14639 14:02:20,974 --> 14:02:22,682 I'm going to drag that\nover to the right. 
14640 14:02:22,682 --> 14:02:25,502 Whoops, I'm going to go ahead and\n 14641 14:02:25,502 --> 14:02:28,472 And now, on the right,\nlet's modify Hello.py 14642 14:02:28,472 --> 14:02:32,222 to look a little more like this\nsecond version in C, all right? 14643 14:02:32,222 --> 14:02:36,389 I want to get an answer from\nthe user as a return value 14644 14:02:36,389 --> 14:02:38,222 but I also want to get\nsome input from them. 14645 14:02:38,222 --> 14:02:41,942 So from CS50, I'm going to import the\n 14646 14:02:41,942 --> 14:02:43,692 We're going to get rid\nof that eventually 14647 14:02:43,692 --> 14:02:45,484 but for now, it's a\nhelpful training wheel. 14648 14:02:45,483 --> 14:02:47,701 And then down here, I'm\ngoing to say, answer 14649 14:02:47,701 --> 14:02:51,031 equals getString quote\nunquote, "What\'s your name"? 14650 14:02:52,502 --> 14:02:53,974 But no semicolon, no data type. 14651 14:02:53,974 --> 14:02:55,891 And then I'm going to\ngo ahead and print, just 14652 14:02:55,891 --> 14:03:01,639 like the first example on the slide,\n 14653 14:03:01,639 --> 14:03:03,182 And now let me go ahead and run this. 14654 14:03:03,182 --> 14:03:06,182 Python, of Hello.py, all right,\nit's asking me what's my name. 14655 14:03:07,891 --> 14:03:13,029 But it's worth calling attention to the\n 14656 14:03:13,029 --> 14:03:15,362 It's not just that the\nindividual functions are simpler. 14657 14:03:15,362 --> 14:03:18,991 What is also now glaringly omitted\nfrom my Python code at right 14658 14:03:18,991 --> 14:03:21,178 both in this version,\nand the previous version. 14659 14:03:21,178 --> 14:03:22,636 What did I not bother implementing? 14660 14:03:23,788 --> 14:03:26,371 DAVID J. MALAN: Yeah, so I didn't\neven need to implement main. 14661 14:03:26,372 --> 14:03:29,732 We'll revisit the main function,\nbecause having a main function 14662 14:03:29,732 --> 14:03:31,382 actually does solve problems sometimes. 
14663 14:03:31,381 --> 14:03:32,612 But it's no longer required. 14664 14:03:32,612 --> 14:03:36,272 In C you have to have that to kick-start\n 14665 14:03:36,858 --> 14:03:39,691 And in fact, if you were missing\n 14666 14:03:39,692 --> 14:03:42,555 if you accidentally compiled\nHelpers.c instead of the file 14667 14:03:42,555 --> 14:03:44,972 that contained main, you would\nhave seen a compiler error. 14668 14:03:44,972 --> 14:03:46,180 In Python it's not necessary. 14669 14:03:46,180 --> 14:03:48,932 In Python you can just jump right\n 14670 14:03:49,872 --> 14:03:51,747 Especially if it's a\nsmall program like this 14671 14:03:51,747 --> 14:03:54,732 you don't need the added overhead\n 14672 14:03:54,732 --> 14:03:56,382 So that's one other difference here. 14673 14:03:56,381 --> 14:03:59,911 All right, there are a few other\nways we could say Hello, world. 14674 14:03:59,911 --> 14:04:02,681 Recall that I could use a format string. 14675 14:04:02,682 --> 14:04:06,881 So I could put this whole thing in\n 14676 14:04:06,881 --> 14:04:09,771 And then let me go ahead and\nrun Python of Hello.py again. 14677 14:04:09,771 --> 14:04:11,771 You can perhaps see where\nwe're going with this. 14678 14:04:11,771 --> 14:04:13,691 Let me type my name,\nDavid, and here we go. 14679 14:04:13,692 --> 14:04:16,092 OK, that's the mistake that\nsomeone identified earlier 14680 14:04:17,561 --> 14:04:21,461 Otherwise no variables are\ninterpolated, that is substituted 14681 14:04:22,911 --> 14:04:26,681 So if I go back in and add those\ncurly braces to the F string 14682 14:04:26,682 --> 14:04:31,154 now let me run Python of Hello.py,\n 14683 14:04:33,701 --> 14:04:37,061 But generally speaking, making\nshorter, more concise code 14684 14:04:38,391 --> 14:04:42,972 So stylistically, the F string is\n 14685 14:04:42,972 --> 14:04:45,802 All right, well, what more\ncan we do besides this? 
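The format string behavior described above can be sketched as follows. The variable answer is hard-coded here as a stand-in for whatever the user would type, so the snippet runs without prompting. It shows the two ways this can go wrong (no f prefix, or no curly braces) next to the working version.

```python
answer = "David"  # stand-in for the user's input

# No f prefix: the braces and the name are printed literally.
no_prefix = "hello, {answer}"

# f prefix but no braces: still just the literal word "answer".
no_braces = f"hello, answer"

# f prefix and braces: the variable's value is interpolated.
formatted = f"hello, {answer}"

# Plain concatenation with + produces the same text.
joined = "hello, " + answer

print(no_prefix)
print(no_braces)
print(formatted)
print(joined)
```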
14686 14:04:45,802 --> 14:04:48,702 Well, let me go ahead here and\n 14687 14:04:51,701 --> 14:04:54,671 Let me get rid of the CS50\nlibrary, which we will ultimately 14688 14:04:54,671 --> 14:04:56,141 in a couple of weeks, anyway. 14689 14:04:56,141 --> 14:04:59,082 I can't use getString,\nbut I can use a function 14690 14:04:59,082 --> 14:05:01,252 that comes with Python called input. 14691 14:05:01,252 --> 14:05:04,572 And, in fact, this is actually a\n 14692 14:05:04,572 --> 14:05:07,902 There's really no downside to\nusing input instead of getString. 14693 14:05:07,902 --> 14:05:09,942 We implement getString\njust for consistency 14694 14:05:09,942 --> 14:05:14,322 with what you saw in C. Python of\n 14695 14:05:14,322 --> 14:05:15,832 Still actually works the same. 14696 14:05:15,832 --> 14:05:17,749 So gone are the CS50\nspecific training wheels. 14697 14:05:17,749 --> 14:05:19,749 But we're going to bring\nthem back shortly, just 14698 14:05:19,749 --> 14:05:21,762 to deal with integers or\nfloats or other values 14699 14:05:21,762 --> 14:05:24,012 too, because it's going to make\nour lives a little simpler 14700 14:05:25,031 --> 14:05:28,871 All right, any questions, before we\n 14701 14:05:28,872 --> 14:05:32,802 from week 1, but now in Python? 14702 14:05:32,802 --> 14:05:34,632 All right, let me go\nahead and open up now. 14703 14:05:34,631 --> 14:05:39,761 Let's say Calculator0.c, which was one\n 14704 14:05:39,762 --> 14:05:43,391 math and operators like that, as\nwell as functions like getInt 14705 14:05:43,391 --> 14:05:48,341 let me go ahead and create a new\nfile now called Calculator.py 14706 14:05:48,341 --> 14:05:51,881 at right, so that I have\nmy C code at left still 14707 14:05:51,881 --> 14:05:53,471 and my Python code at right. 14708 14:05:53,472 --> 14:05:57,132 All right, let me go dive into a\n 14709 14:05:57,131 --> 14:05:59,621 I am going to use getInt\nfrom the CS50 library. 
14710 14:06:01,482 --> 14:06:03,862 I'm going to go ahead now\nand get an Int from the user. 14711 14:06:03,862 --> 14:06:07,522 So x equals getInt, and I'll\nask them for an x value 14712 14:06:08,951 --> 14:06:14,322 No need to specify a semicolon,\nthough, or an Int for the x. 14713 14:06:15,461 --> 14:06:18,612 Y is going to get\nanother Int via y colon 14714 14:06:18,612 --> 14:06:23,351 and then down here, I'm going to\n 14715 14:06:23,351 --> 14:06:25,241 So this is already a bit new. 14716 14:06:25,241 --> 14:06:29,921 Recall, the C version required that\n 14717 14:06:30,949 --> 14:06:32,741 Python is just a little\nmore user-friendly. 14718 14:06:32,741 --> 14:06:36,191 If all you want to do is print out a\n 14719 14:06:36,192 --> 14:06:39,132 Don't futz with any percent\nsigns or format codes. 14720 14:06:39,131 --> 14:06:41,682 It's not printf, it's\nindeed just print now. 14721 14:06:41,682 --> 14:06:45,131 All right, let me go ahead and\nrun Python of Calculator.py 14722 14:06:45,131 --> 14:06:50,141 Enter, just do a quick sample,\n1 plus 2 indeed equals 3. 14723 14:06:50,141 --> 14:06:52,932 As an aside, suppose I had\ntaken a different approach 14724 14:06:52,932 --> 14:06:56,029 to importing the whole CS50 library,\n 14725 14:06:56,029 --> 14:06:58,072 You're not to notice any\nperformance impact here. 14726 14:06:59,211 --> 14:07:02,201 But notice what does not\nwork now, whereas it did work 14727 14:07:02,201 --> 14:07:07,631 in C. Python of Calculator.py, Enter,\n 14728 14:07:08,211 --> 14:07:10,091 So a traceback is just\na term of art that 14729 14:07:10,091 --> 14:07:13,731 says, here is a trace back\nthrough all of the functions 14730 14:07:14,771 --> 14:07:16,691 In the world of C, you\nmight call this a stack 14731 14:07:16,692 --> 14:07:19,459 trace, stack being the operative word. 
14732 14:07:19,459 --> 14:07:21,792 Recall that when we talked\nabout the stack and the heap 14733 14:07:21,792 --> 14:07:24,599 the stack, like a stack of trays,\nwas all of the functions that 14734 14:07:24,599 --> 14:07:26,182 might get called, one after the other. 14735 14:07:26,182 --> 14:07:30,851 We had main, we had swap, then swap went\n 14736 14:07:30,851 --> 14:07:34,542 So here's a trace back of all of the\n 14737 14:07:34,542 --> 14:07:37,402 There's not really any functions\nother than my file itself. 14738 14:07:37,402 --> 14:07:38,872 Otherwise there'd be more detail. 14739 14:07:38,872 --> 14:07:42,102 But even though it's a little cryptic,\n 14740 14:07:42,101 --> 14:07:46,481 here, name error, so something related\n 14741 14:07:47,472 --> 14:07:50,711 And this of course, happens\non line 3 over there. 14742 14:07:52,042 --> 14:07:55,692 Well, Python essentially\nallows us to namespace 14743 14:07:55,692 --> 14:07:58,272 our functions that come from libraries. 14744 14:07:58,271 --> 14:08:01,811 There was a problem in C. If\nyou were using the CS50 library 14745 14:08:01,811 --> 14:08:03,701 and thus had access\nto getInt, getString 14746 14:08:03,701 --> 14:08:06,371 and so forth, you could\nnot use another library 14747 14:08:06,372 --> 14:08:08,112 that had the same function names. 14748 14:08:08,112 --> 14:08:10,031 They would collide, and\nthe compiler would not 14749 14:08:10,031 --> 14:08:12,551 know how to link them\ntogether correctly. 14750 14:08:12,552 --> 14:08:18,042 In Python, and other languages\nlike JavaScript, and in Java 14751 14:08:18,042 --> 14:08:21,792 you have support for effectively\n 14752 14:08:21,792 --> 14:08:26,891 You can isolate variables and\n 14753 14:08:26,891 --> 14:08:29,112 like their own container in memory. 
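A rough sketch of the namespacing idea, using the standard library's math module as a stand-in for the CS50 library so that it runs anywhere (that substitution is an assumption made for illustration). Importing the whole module makes only the module's name available, so a bare function name raises a NameError until it is either prefixed with the module name or imported by name.

```python
import math

# Only the name "math" exists so far, so a bare sqrt is undefined.
try:
    sqrt(16)
except NameError as error:
    message = str(error)
    print("NameError:", message)

# Prefixing with the module name says which sqrt is meant.
prefixed = math.sqrt(16)
print(prefixed)

# Alternatively, import just the one function by name.
from math import sqrt

direct = sqrt(16)
print(direct)
```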
14754 14:08:29,112 --> 14:08:32,082 And what this means is,\nif you import all of CS50 14755 14:08:32,082 --> 14:08:36,252 you have to say that the getInt you\n 14756 14:08:36,252 --> 14:08:39,701 So just like with the image\nblurring, and the image edges 14757 14:08:39,701 --> 14:08:44,951 before, where I had to specify image dot\n 14758 14:08:44,951 --> 14:08:48,491 am I specifying with a dot operator,\n 14759 14:08:48,491 --> 14:08:50,932 want CS50.getInt in both places. 14760 14:08:50,932 --> 14:08:54,641 And now if I rerun Python\nof Calculator.py, 1 and 2 14761 14:08:57,311 --> 14:09:01,311 Generally speaking, it depends\non just how many functions 14762 14:09:01,311 --> 14:09:02,561 you're using from the library. 14763 14:09:02,561 --> 14:09:05,561 If you're using a whole bunch of\n 14764 14:09:05,561 --> 14:09:09,854 If you're only using maybe one\nor two, import them line by line. 14765 14:09:09,855 --> 14:09:12,272 All right, so let's go ahead\nand make a little tweak here. 14766 14:09:12,271 --> 14:09:15,438 Let's get rid of this library\nand take this training wheel off 14767 14:09:15,438 --> 14:09:18,271 too, as quickly as we introduced\n 14768 14:09:18,271 --> 14:09:20,831 you'll be able to use all\nof these same functions. 14769 14:09:20,832 --> 14:09:24,632 Suppose I get rid of this, and\nI just use the input function 14770 14:09:24,631 --> 14:09:28,231 just like I did by\nreplacing getString earlier. 14771 14:09:28,232 --> 14:09:31,232 Let me go ahead now and run\nthis version of the code. 14772 14:09:31,232 --> 14:09:37,486 Python of Calculator.py, OK,\nhow about 1 plus 2 equals 3. 14773 14:09:39,182 --> 14:09:41,851 All right, obviously wrong, incorrect. 14774 14:09:41,851 --> 14:09:46,411 Can anyone explain what just\nhappened, based on instincts? 14775 14:09:47,911 --> 14:09:49,141 AUDIENCE: You want an answer? 14776 14:09:49,141 --> 14:09:50,266 DAVID J. MALAN: Sure, yeah. 
14777 14:09:50,266 --> 14:09:54,451 AUDIENCE: Say you have a number\nof strings that don't have Ints 14778 14:09:54,451 --> 14:09:57,841 so you would part with them and\nsay, printing one, two, better. 14779 14:09:57,841 --> 14:10:01,171 DAVID J. MALAN: Exactly, Python\nis interpreting, or treating 14780 14:10:01,171 --> 14:10:03,332 both x and y as strings,\nwhich is actually 14781 14:10:03,332 --> 14:10:05,641 what the input function\nreturns by default. 14782 14:10:05,641 --> 14:10:08,671 And so plus is now being interpreted\n 14783 14:10:09,182 --> 14:10:12,302 So x plus y isn't x\nplus y mathematically 14784 14:10:12,302 --> 14:10:15,002 but in terms of string\njoining, just like in Scratch. 14785 14:10:15,002 --> 14:10:18,211 So that's why we're getting\n12, or really one two 14786 14:10:18,211 --> 14:10:19,561 which isn't itself a number. 14787 14:10:20,701 --> 14:10:22,471 So we somehow need to convert things. 14788 14:10:22,472 --> 14:10:25,561 And we didn't have this\nability quite as easily in C. 14789 14:10:25,561 --> 14:10:29,191 We did have like the A to i\nfunction, ASCII to integer 14790 14:10:29,192 --> 14:10:30,792 which did allow you to do this. 14791 14:10:30,792 --> 14:10:35,912 The analog in Python is actually just\n 14792 14:10:35,911 --> 14:10:39,271 So just like in C, you\ncan use the keyword Int 14793 14:10:39,271 --> 14:10:41,021 but you use it a little differently. 14794 14:10:41,021 --> 14:10:45,822 Notice that I'm not doing parenthesis\n 14795 14:10:45,822 --> 14:10:47,531 I'm using Int as a function. 14796 14:10:47,531 --> 14:10:49,951 So indeed, in Python, Int is a function. 14797 14:10:49,951 --> 14:10:53,131 Float is a function, that\nyou can pass values into 14798 14:10:53,131 --> 14:10:54,792 to do this kind of conversion. 14799 14:10:54,792 --> 14:10:58,531 So now, if I run Python\nof Calculator.py, 1 and 2 14800 14:10:58,531 --> 14:11:01,951 now we're back in business,\nand getting the answer of 3. 
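The difference between joining strings and adding numbers can be sketched like this. The values "1" and "2" are hard-coded to mimic what input would return, since input always hands back a str.

```python
# input() returns text, so these mimic what the user typed.
x = "1"
y = "2"

# With two strings, + joins them end to end.
concatenated = x + y
print(concatenated)  # 12

# Converting each value with int() first gives real addition.
total = int(x) + int(y)
print(total)  # 3
```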
14801 14:11:01,951 --> 14:11:03,761 But there's kind of a catch here. 14802 14:11:03,762 --> 14:11:04,952 There's always going to be a trade-off. 14803 14:11:04,951 --> 14:11:07,081 Like that sounds amazing that\nit just works in this way. 14804 14:11:07,082 --> 14:11:08,972 We can throw away the\nCS50 library already. 14805 14:11:08,972 --> 14:11:13,652 But what if the user accidentally\n 14806 14:11:13,652 --> 14:11:15,557 like a cat, instead of a number. 14807 14:11:15,557 --> 14:11:17,432 Damn, well, there's one\nof these trace backs. 14808 14:11:17,432 --> 14:11:19,302 Like, now my program has crashed. 14809 14:11:19,302 --> 14:11:21,864 This is similar in spirit\nto the kinds of segfaults 14810 14:11:21,864 --> 14:11:23,072 that you might have had in C. 14811 14:11:23,072 --> 14:11:24,362 But they're not segfaults per se. 14812 14:11:24,362 --> 14:11:26,029 It doesn't necessarily relate to memory. 14813 14:11:26,029 --> 14:11:31,812 This time it relates to actual\n 14814 14:11:31,811 --> 14:11:34,771 So this time it's not a name\nerror, it's a value error 14815 14:11:34,771 --> 14:11:39,101 invalid literal for Int with\nbase 10 quote unquote "cat. 14816 14:11:39,101 --> 14:11:43,322 So, again, it's written for sort\nof a programmer, more than sort 14817 14:11:43,322 --> 14:11:46,171 of a typical person, because it's\n 14818 14:11:46,171 --> 14:11:47,421 But let's try to interpret it. 14819 14:11:47,421 --> 14:11:51,383 Invalid literal, a literal is just\n 14820 14:11:51,383 --> 14:11:52,841 is the function name, with base 10. 14821 14:11:52,841 --> 14:11:54,691 It's just defaulting to decimal numbers. 14822 14:11:54,692 --> 14:11:56,937 Cat is apparently not a decimal number. 14823 14:11:56,936 --> 14:11:59,561 It doesn't look like it, therefore\nit can't be treated like it. 14824 14:11:59,561 --> 14:12:01,451 Therefore, there's a value error. 14825 14:12:03,271 --> 14:12:06,721 Unfortunately, you would have\nto somehow catch this error. 
14826 14:12:06,722 --> 14:12:08,972 And the only way to do\nthat in Python really 14827 14:12:08,972 --> 14:12:11,491 is by way of another\nfeature that C did not have 14828 14:12:11,491 --> 14:12:13,921 namely, what are called exceptions. 14829 14:12:13,921 --> 14:12:18,601 An exception is exactly what just\n 14830 14:12:18,601 --> 14:12:22,112 They are things that can go wrong\n 14831 14:12:22,112 --> 14:12:27,192 that aren't necessarily going to be\n 14832 14:12:27,192 --> 14:12:32,762 So in Python, and in JavaScript, and in\n 14833 14:12:32,762 --> 14:12:35,762 there's this ability to\nactually try to do something 14834 14:12:35,762 --> 14:12:37,537 except if something goes wrong. 14835 14:12:37,536 --> 14:12:39,661 And in fact, I'm going to\nintroduce a bit of syntax 14836 14:12:39,661 --> 14:12:42,078 here, even though we won't\nhave to use this much just yet. 14837 14:12:42,078 --> 14:12:46,502 Instead of just blindly converting\nx to an Int, let me go ahead 14838 14:12:48,491 --> 14:12:51,902 And if there's an exception,\ngo ahead and say something 14839 14:12:51,902 --> 14:12:58,802 like print, that is not an Int. 14840 14:12:58,802 --> 14:13:02,060 And then I'm going to do\nsomething like exit, right there. 14841 14:13:02,059 --> 14:13:03,601 And let me go ahead and do this here. 14842 14:13:03,601 --> 14:13:07,891 Let me try to get y, except\nif there's an exception. 14843 14:13:07,891 --> 14:13:12,519 Then let me go ahead and say, again,\n 14844 14:13:12,519 --> 14:13:14,851 And then I'm going to exit\nfrom there to, otherwise I'll 14845 14:13:14,851 --> 14:13:16,381 go ahead and print x plus y. 14846 14:13:16,381 --> 14:13:22,981 If I run Python of\nCalculator.py now, whoops, oh 14847 14:13:22,982 --> 14:13:25,202 forgot my close quote, sorry. 
14848 14:13:25,201 --> 14:13:31,081 All right, so close quote, Python of\n 14849 14:13:31,082 --> 14:13:34,322 But if I try to type in\nsomething wrong like cat, now 14850 14:13:34,322 --> 14:13:35,832 it actually detects the error. 14851 14:13:35,832 --> 14:13:38,372 So what is the CS50\nlibrary in Python doing? 14852 14:13:38,372 --> 14:13:42,122 It's actually doing that try and except\n 14853 14:13:42,122 --> 14:13:45,062 otherwise your programs for\nsomething simple, like a calculator 14854 14:13:45,061 --> 14:13:46,421 start to get longer and longer. 14855 14:13:46,421 --> 14:13:49,682 So we factored that kind of\nlogic out to the CS50 getInt 14856 14:13:49,682 --> 14:13:51,211 function and get float function. 14857 14:13:51,211 --> 14:13:55,305 But underneath the hood, they're\n 14858 14:13:55,305 --> 14:13:56,972 but they're being a little more precise. 14859 14:13:56,972 --> 14:14:00,972 They're detecting a specific error,\n 14860 14:14:00,972 --> 14:14:03,572 so that these functions will\nget executed again and again. 14861 14:14:03,572 --> 14:14:07,232 In fact, the best way to do this is to\n 14862 14:14:07,232 --> 14:14:10,600 then print that error\nmessage out to the user. 14863 14:14:10,599 --> 14:14:13,391 And again, let's not get too into\n 14864 14:14:13,391 --> 14:14:15,281 We've already put into the CS50 library. 14865 14:14:15,281 --> 14:14:17,582 But that's why, for instance,\nwe bootstrap things 14866 14:14:17,582 --> 14:14:20,942 by just using these\nfunctions out of the box. 14867 14:14:20,942 --> 14:14:24,132 All right, let's do something\nmore with our calculator here. 
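A minimal sketch of the retry idea described above: try the conversion, catch the specific ValueError, and ask again. This is not the CS50 library's actual code. The read parameter is a hypothetical hook added so the example can be driven by canned answers instead of a keyboard; by default it is the built-in input.

```python
def get_int(prompt, read=input):
    """Keep prompting until the text read can be converted to an int."""
    while True:
        try:
            return int(read(prompt))
        except ValueError:
            # Only conversion failures are caught; other errors still surface.
            print("That is not an int!")


# Simulate a user who types "cat" first and then "7".
typed = iter(["cat", "7"])
x = get_int("x: ", read=lambda prompt: next(typed))
print(x)  # 7
```

Catching ValueError by name, rather than every possible exception, keeps unrelated bugs from being silently swallowed.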
14868 14:14:25,531 --> 14:14:28,411 In the world of C, we\nhad another version 14869 14:14:28,411 --> 14:14:33,511 of this code, which actually\ndid some division by way of-- 14870 14:14:33,512 --> 14:14:38,202 which actually did division of\n 14871 14:14:38,201 --> 14:14:42,511 So let me go ahead and close the C\n 14872 14:14:42,512 --> 14:14:44,463 now, doing some of these\nsame lines of codes. 14873 14:14:44,463 --> 14:14:46,171 But I'm going to go\nahead and just assume 14874 14:14:46,171 --> 14:14:48,661 that the user is going to\ncooperate and use proper input. 14875 14:14:48,661 --> 14:14:52,831 So from CS50, import getInt, that\n 14876 14:14:52,832 --> 14:15:00,162 X gets getInt, ask the user\nfor an Int x, y equals getInt 14877 14:15:01,692 --> 14:15:03,531 And then, let's go ahead and do this. 14878 14:15:03,531 --> 14:15:07,631 Let's declare a variable called\n 14879 14:15:07,631 --> 14:15:09,371 Then let's go ahead and print z. 14880 14:15:09,372 --> 14:15:13,762 Still no need for a format string, I\n 14881 14:15:13,762 --> 14:15:15,762 Let me go ahead and run\nPython of Calculator.py. 14882 14:15:15,762 --> 14:15:20,172 Let me do 1, 10, and I get 0.1. 14883 14:15:20,171 --> 14:15:25,781 What did I get in C,\nthough, if you think back. 14884 14:15:25,781 --> 14:15:28,597 What would we have happened in C? 14885 14:15:29,942 --> 14:15:32,162 DAVID J. MALAN: Yeah, we\nwould have gotten zero in C. 14886 14:15:32,161 --> 14:15:34,519 But why, in C, when you\ndivide one Int by another 14887 14:15:34,519 --> 14:15:36,436 and those Ints are like\n1 and 10 respectively? 14888 14:15:36,436 --> 14:15:38,199 AUDIENCE: It'll give\nyou an integer back. 14889 14:15:38,199 --> 14:15:39,781 DAVID J. MALAN: It will give you what? 14890 14:15:40,864 --> 14:15:44,432 DAVID J. MALAN: It will give you an\n 14891 14:15:44,432 --> 14:15:46,381 the integer part of it is indeed zero. 14892 14:15:46,381 --> 14:15:48,491 So this was an example of truncation. 
14893 14:15:48,491 --> 14:15:51,061 So truncation was an\nissue in C. But it would 14894 14:15:51,061 --> 14:15:53,972 seem as though this is no\nlonger a problem in Python 14895 14:15:53,972 --> 14:15:57,811 insofar as the division operator\nactually handles that for us. 14896 14:15:57,811 --> 14:16:00,752 As an aside, if you want the old\nbehavior, because it actually 14897 14:16:00,752 --> 14:16:03,542 is sometimes useful for\nrounding or flooring values 14898 14:16:03,542 --> 14:16:06,092 you can actually use two slashes. 14899 14:16:06,091 --> 14:16:08,141 And now you get the C behavior. 14900 14:16:08,141 --> 14:16:10,231 So that now 1 divided by 10 is zero. 14901 14:16:10,232 --> 14:16:12,752 So you don't give up that\ncapability, but at least it 14902 14:16:12,752 --> 14:16:14,131 does a more sensible default. 14903 14:16:14,131 --> 14:16:17,551 Most people, especially new programmers,\n 14904 14:16:17,552 --> 14:16:20,522 would want to get 0.1,\nnot 0, for reasons 14905 14:16:20,521 --> 14:16:22,621 that indeed we had to explain weeks ago. 14906 14:16:22,622 --> 14:16:26,461 But what about another problem we\n 14907 14:16:26,461 --> 14:16:28,561 whereby there is imprecision? 14908 14:16:28,561 --> 14:16:31,502 Let me go ahead and, somewhat\n 14909 14:16:32,381 --> 14:16:34,862 I'm going to format\nit using an f-string. 14910 14:16:34,862 --> 14:16:39,241 And I'm going to go ahead and format,\n 14911 14:16:39,972 --> 14:16:43,141 Notice this, if I do Python\nof Calculator.py, 1 and 10 14912 14:16:43,141 --> 14:16:46,292 I get, by default, just\none significant digit. 14913 14:16:46,292 --> 14:16:50,442 But if I use this syntax in Python,\n 14914 14:16:50,442 --> 14:16:53,072 I can actually do in\nC like I did before 14915 14:16:53,072 --> 14:16:56,171 50 significant digits\nafter the decimal point. 
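The two division operators and the format syntax can be sketched together. The values 1 and 10 are hard-coded to match the sample run in the lecture. The part after the colon inside the braces, .50f, asks for 50 digits after the decimal point.

```python
x, y = 1, 10

# A single slash always performs true division and yields a float.
z = x / y
print(z)  # 0.1

# A double slash floors the result, which matches C for these positive ints.
floored = x // y
print(floored)  # 0

# Ask for 50 digits after the decimal point to expose the stored value.
formatted = f"{z:.50f}"
print(formatted)
```

The last line shows digits beyond the first 1 that are not all zeros, which is the floating-point imprecision carried over from C.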
14916 14:16:56,171 --> 14:17:00,542 So now let me rerun Python\nof Calculator.py 1 and 10 14917 14:17:00,542 --> 14:17:03,512 and let's see if floating point\nimprecision is still with us. 14918 14:17:04,802 --> 14:17:07,472 And you can see as much here,\nthe f-string, the format string 14919 14:17:07,472 --> 14:17:10,512 is just showing us now 50 digits\ninstead of the default one. 14920 14:17:10,512 --> 14:17:12,632 So we've not solved all problems. 14921 14:17:12,631 --> 14:17:15,366 But we have solved at least some. 14922 14:17:15,366 --> 14:17:18,241 All right, before we pivot away from\n 14923 14:17:18,241 --> 14:17:21,871 now on syntax or concepts or the like? 14924 14:17:22,591 --> 14:17:25,841 AUDIENCE: Do you think\nthe double slash you get 14925 14:17:25,841 --> 14:17:28,458 has merit, how do you comment on that? 14926 14:17:28,459 --> 14:17:29,792 DAVID J. MALAN: How do you what? 14927 14:17:30,750 --> 14:17:33,932 Really good question, if you're\nusing double slash for division 14928 14:17:33,932 --> 14:17:36,391 with flooring or truncation,\nlike I described 14929 14:17:36,391 --> 14:17:38,372 how do you do a comment in Python. 14930 14:17:39,902 --> 14:17:42,452 And the convention is actually\nto use a complete sentence 14931 14:17:42,451 --> 14:17:43,994 like with a capital T here. 14932 14:17:43,995 --> 14:17:46,412 You don't need a period unless\nthere's multiple sentences. 14933 14:17:46,411 --> 14:17:49,361 And technically, it should be above\n 14934 14:17:49,362 --> 14:17:51,641 So you would use a hash symbol instead. 14935 14:17:53,942 --> 14:17:57,272 All right, let's go ahead and make\n 14936 14:17:57,271 --> 14:17:59,951 Let me go ahead and\nopen up, for instance 14937 14:17:59,951 --> 14:18:05,612 an example called Points1.c,\nwhich we saw a few weeks back. 
14938 14:18:05,612 --> 14:18:10,052 And let me go ahead on the other side\n 14939 14:18:10,052 --> 14:18:13,412 This was a program, recall, that\n 14940 14:18:13,411 --> 14:18:15,909 lost on the first assignment. 14941 14:18:15,910 --> 14:18:17,702 And then it went ahead\nand just printed out 14942 14:18:17,701 --> 14:18:20,311 whether they lost fewer points\nthan me, because I lost two 14943 14:18:20,311 --> 14:18:23,638 if you recall the photo, more points\n 14944 14:18:23,639 --> 14:18:26,222 Let me go ahead and zoom out so\nwe can see a bit more of this. 14945 14:18:26,222 --> 14:18:30,730 And let me now, on the top right here,\n 14946 14:18:30,730 --> 14:18:33,272 So I want to first prompt the\nuser for some number of points. 14947 14:18:33,271 --> 14:18:37,061 So from CS50 let's import getInt,\n 14948 14:18:37,061 --> 14:18:39,932 Let's then do points\nequals getInt, and ask 14949 14:18:39,932 --> 14:18:43,951 the user, how many points\ndid you lose, question mark. 14950 14:18:43,951 --> 14:18:48,511 Then let's go ahead and say, if points\n 14951 14:18:48,512 --> 14:18:52,322 print, you lost fewer points than me. 14952 14:18:52,322 --> 14:18:59,792 Otherwise, if it's else if points\n 14953 14:18:59,792 --> 14:19:03,592 you lost more points than me. 14954 14:19:03,591 --> 14:19:07,322 Else let's go ahead and handle\nthe final scenario, which is you 14955 14:19:07,322 --> 14:19:11,122 lost the same number of points as me. 14956 14:19:11,122 --> 14:19:15,752 Before I run this, does anyone want to\n 14957 14:19:16,252 --> 14:19:17,912 AUDIENCE: Else if has to be elif. 14958 14:19:17,911 --> 14:19:21,211 DAVID J. MALAN: Yeah, so else if in\n 14959 14:19:22,302 --> 14:19:26,312 So let me change this to elif, and now\n 14960 14:19:26,311 --> 14:19:29,851 suppose you lost three\npoints on some assignment. 14961 14:19:29,851 --> 14:19:31,711 You lost more points than my two. 
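The conditional described above might look like the sketch below. It is wrapped in a function named compare, with the lecturer's two lost points as a default argument; both are choices made here for illustration so the logic can be run without prompting.

```python
def compare(points, mine=2):
    """Return the message for a given number of lost points."""
    if points < mine:
        return "You lost fewer points than me."
    elif points > mine:  # Python spells "else if" as elif
        return "You lost more points than me."
    else:
        return "You lost the same number of points as me."


print(compare(3))
print(compare(1))
print(compare(2))
```

Indentation and the colon at the end of each condition do the work that curly braces and parentheses did in C.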
14962 14:19:31,711 --> 14:19:34,330 If you only lost one point,\nyou lost fewer points than me. 14963 14:19:35,372 --> 14:19:37,562 But notice the code is much tighter. 14964 14:19:37,561 --> 14:19:41,222 In 10 total lines, we did in\nwhat was 24 lines, because we've 14965 14:19:41,222 --> 14:19:42,872 thrown away a lot of the syntax. 14966 14:19:42,872 --> 14:19:44,891 The curly braces are\nno longer necessary. 14967 14:19:44,891 --> 14:19:46,752 The parentheses are\ngone, the semicolons. 14968 14:19:46,752 --> 14:19:50,192 So this is why it just tends to\nbe more pleasant pretty quickly 14969 14:19:52,832 --> 14:19:55,292 All right, let's do\none other example here. 14970 14:19:55,292 --> 14:19:59,522 In C, recall that we were able to\n 14971 14:19:59,521 --> 14:20:01,111 if something is even or odd. 14972 14:20:01,112 --> 14:20:05,522 Well, in Python, let me go ahead\n 14973 14:20:05,521 --> 14:20:09,331 and let's look for a moment\nat the C version at left. 14974 14:20:09,332 --> 14:20:13,202 Here was the code in C that we used\n 14975 14:20:13,201 --> 14:20:16,322 And, really, the key\ntakeaway from all these lines 14976 14:20:16,322 --> 14:20:17,811 was just the remainder operator. 14977 14:20:17,811 --> 14:20:19,061 And that one is still with us. 14978 14:20:19,061 --> 14:20:21,519 So this is a simple demonstration,\njust to make that point 14979 14:20:21,519 --> 14:20:25,291 if in Python, I want to determine\n 14980 14:20:25,292 --> 14:20:29,671 Well, let's go ahead and from CS50,\n 14981 14:20:29,671 --> 14:20:35,131 and get a number like n from the user,\n 14982 14:20:35,131 --> 14:20:40,741 And then let's go ahead and say,\nif n percent sign 2 equals 0 14983 14:20:40,741 --> 14:20:44,792 then let\'s go ahead and\nprint quote unquote "Even. 14984 14:20:44,792 --> 14:20:50,275 Else let's go ahead and print\nout Odd, but before I run this 14985 14:20:50,275 --> 14:20:53,192 anyone want to instinctively, even\n 14986 14:20:57,332 --> 14:20:58,957 DAVID J. 
MALAN: Yeah, so double equals. 14987 14:20:58,957 --> 14:21:02,372 Again, so even though some of the stuff\n 14988 14:21:02,951 --> 14:21:05,041 So this, too, should\nbe a double equal sign 14989 14:21:05,042 --> 14:21:07,141 because I'm comparing for equality here. 14990 14:21:07,141 --> 14:21:08,675 And why is this the right math? 14991 14:21:08,675 --> 14:21:10,592 Well, if you divide a\nnumber by 2, it's either 14992 14:21:10,591 --> 14:21:12,811 going to have 0 or 1 as a remainder. 14993 14:21:12,811 --> 14:21:15,551 And that's going to determine\nif it's even or odd for us. 14994 14:21:15,552 --> 14:21:18,722 So let's run Python of Parity.py,\ntype in a number like 50 14995 14:21:18,722 --> 14:21:21,182 and hopefully we get, indeed, even. 14996 14:21:21,182 --> 14:21:23,432 So again, same idea, but now\nwe're down to eight lines 14997 14:21:25,082 --> 14:21:27,332 Well, let's now do something\na little more interactive 14998 14:21:27,332 --> 14:21:31,202 and a little representative of tools\n 14999 14:21:31,201 --> 14:21:36,841 In C, recall that we had this\nagreement program, Agree.c. 15000 14:21:36,841 --> 14:21:40,801 And then let's go ahead and implement\n 15001 14:21:42,391 --> 14:21:45,091 And let's look at the C version first. 15002 14:21:45,091 --> 14:21:47,221 On the left, we used get char here. 15003 14:21:47,222 --> 14:21:49,711 And then we used the\ndouble vertical bars 15004 14:21:49,711 --> 14:21:52,951 to check if C is equal to\ncapital Y or lowercase y. 15005 14:21:52,951 --> 14:21:55,021 And then we did the\nsame thing for n for no. 15006 14:21:55,021 --> 14:22:00,901 And so let's go over here and\nlet's do from CS50, import get-- 15007 14:22:00,902 --> 14:22:03,092 OK, get char is not a thing. 15008 14:22:03,091 --> 14:22:05,611 And this here is another\ndifference with Python. 15009 14:22:05,612 --> 14:22:09,031 There is no data type for\nindividual characters. 
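The parity check walked through above can be sketched as a small function. The name parity is made up for illustration, and the number is passed in directly rather than read from the user.

```python
def parity(n):
    """Report whether n is even or odd using the remainder operator."""
    # Comparison uses ==, because a single = is assignment.
    if n % 2 == 0:
        return "Even"
    else:
        return "Odd"


print(parity(50))  # Even
print(parity(7))   # Odd
```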
15010 14:22:09,031 --> 14:22:11,161 You have strings, STRs,\nand, honestly, those 15011 14:22:11,161 --> 14:22:13,141 are fine, because if\nyou have a STR that's 15012 14:22:13,141 --> 14:22:15,481 just one character, for\nall intents and purposes 15013 14:22:15,482 --> 14:22:17,232 it is just a single character. 15014 14:22:17,232 --> 14:22:18,482 So it's just a simplification. 15015 14:22:18,482 --> 14:22:19,722 You don't have to think as much. 15016 14:22:19,722 --> 14:22:22,180 You don't have to worry about\ndouble quotes, single quotes. 15017 14:22:22,180 --> 14:22:25,872 In fact, in Python, you can use\ndouble quotes or single quotes 15018 14:22:25,872 --> 14:22:27,452 so long as you're consistent. 15019 14:22:27,451 --> 14:22:29,491 So long as you're\nconsistent, the single quotes 15020 14:22:29,491 --> 14:22:32,191 do not mean something\ndifferent, like they do in C. 15021 14:22:32,192 --> 14:22:34,862 So I'm going to go ahead\nand use getString here 15022 14:22:34,862 --> 14:22:37,741 although, strictly speaking, I\n 15023 14:22:39,002 --> 14:22:43,771 I'm going to get a string from the\n 15024 14:22:43,771 --> 14:22:47,078 quote unquote, "Do you agree," like a\n 15025 14:22:47,078 --> 14:22:50,161 where you have to say yes or no, you\n 15026 14:22:51,101 --> 14:22:54,631 And then let's translate the\nconditionals to Python, now, too. 15027 14:22:54,631 --> 14:23:02,371 So if S equals equals quote-unquote\n 15028 14:23:02,372 --> 14:23:08,702 let's go ahead and print out agreed,\n 15029 14:23:08,701 --> 14:23:12,061 equals N or S equals equals little n. 15030 14:23:12,061 --> 14:23:14,579 Let's go ahead, then,\nand print out not agreed. 15031 14:23:14,580 --> 14:23:17,372 And you can already see, perhaps,\n 15032 14:23:17,372 --> 14:23:20,222 Is Python a little more\nEnglish-like, in that 15033 14:23:20,222 --> 14:23:24,132 you just literally use the English word\n 15034 14:23:24,131 --> 14:23:26,891 But it's ultimately\ndoing the same thing. 
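The first translation of the agreement program, using the word `or` in place of C's double vertical bars, might look like the sketch below. The helper name `agree` and the exact output strings are assumptions made for illustration, and the answer is passed in as an argument rather than read with the CS50 library.

```python
def agree(s):
    """Classify a one-letter answer (hypothetical helper name)."""
    # Python spells the logical operator as the word "or", not "||".
    if s == "Y" or s == "y":
        return "Agreed."
    elif s == "N" or s == "n":
        return "Not agreed."
    return None  # neither branch matched, for example with "Yes"


# Single and double quotes build the same str; there is no separate char type.
print('y' == "y")
print(type("y") is str)
print(agree("Y"))
print(agree("n"))
```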
15035 14:23:26,891 --> 14:23:29,911 Can we simplify this code a bit, though. 15036 14:23:29,911 --> 14:23:31,861 This would be a little\nannoying if we wanted 15037 14:23:31,862 --> 14:23:34,322 to add support, not just\nfor big Y and little y 15038 14:23:34,322 --> 14:23:40,752 but Yes or big Yes or little yes or\n 15039 14:23:40,752 --> 14:23:43,652 There's a lot of permutations\nof Y-E-S or just y 15040 14:23:43,652 --> 14:23:45,241 that we ideally should tolerate. 15041 14:23:45,241 --> 14:23:47,991 Otherwise, the user is going to\n 15042 14:23:47,991 --> 14:23:49,292 which isn't very user-friendly. 15043 14:23:49,292 --> 14:23:51,572 Any intuition for how\nwe could logically 15044 14:23:51,572 --> 14:23:54,792 even if you don't know how to\ndo it in code, make this better? 15045 14:23:55,292 --> 14:23:58,057 AUDIENCE: Write way over\nthe list, and then up 15046 14:23:58,057 --> 14:23:59,432 it's like the things in the list. 15047 14:23:59,432 --> 14:24:03,572 DAVID J. MALAN: Nice, yeah, we saw an\n 15048 14:24:03,572 --> 14:24:06,421 Why don't we take that same\nidea and ask a similar question. 15049 14:24:06,421 --> 14:24:11,341 If S is in the following list\nof values, Y or little y 15050 14:24:11,341 --> 14:24:15,122 or heck, let me add to the list\n 15051 14:24:15,122 --> 14:24:17,300 And it's going to get a\nlittle annoying, admittedly 15052 14:24:17,300 --> 14:24:20,271 but this is still better than the\n 15053 14:24:20,271 --> 14:24:22,161 I could do things like\nthis, and so forth. 15054 14:24:22,161 --> 14:24:24,261 There's a whole bunch more permutations. 15055 14:24:24,262 --> 14:24:26,992 But let's leave this alone,\nand let me just go into here 15056 14:24:26,991 --> 14:24:33,800 and change this to, if S is in the\n 15057 14:24:33,800 --> 14:24:36,981 and I won't do as, let's just not\n 15058 14:24:39,322 --> 14:24:42,472 Python of Agree.py, do I agree? 15059 14:24:45,262 --> 14:24:46,881 All right, how about big Yes. 
15060 14:24:46,881 --> 14:24:48,372 OK, that does not seem to work. 15061 14:24:48,372 --> 14:24:50,872 Notice it did not say agreed,\nand it did not say not agreed. 15062 14:24:53,701 --> 14:24:57,291 Well, you know what I could\ndo, what I don't really 15063 14:24:57,292 --> 14:24:58,762 need the uppercase and lowercase. 15064 14:24:58,762 --> 14:25:00,711 Let me tighten this\nlist up a little bit. 15065 14:25:00,711 --> 14:25:04,162 And why don't I just\nforce S to be lowercase. 15066 14:25:04,161 --> 14:25:07,521 S.lower, recall, whether\nit's one character or more 15067 14:25:07,521 --> 14:25:10,701 is a function built into\nSTRs now, strings in Python 15068 14:25:10,701 --> 14:25:12,471 that forces the whole\nthing to lowercase. 15069 14:25:13,972 --> 14:25:19,222 Python of Agree.py, little y,\nthat works, big Y, that works. 15070 14:25:19,222 --> 14:25:24,362 Big Yes, that works, big Y,\nlittle e, big S, that also works. 15071 14:25:24,362 --> 14:25:27,432 So we've now handled, in one fell\n 15072 14:25:27,432 --> 14:25:29,432 And you know what, we can\ntighten this up a bit. 15073 14:25:29,432 --> 14:25:32,872 Here's an opportunity, in Python,\nfor slightly better design. 15074 14:25:32,872 --> 14:25:36,592 What have I done in here\nthat's a little redundant? 15075 14:25:36,591 --> 14:25:40,701 Does anyone see an opportunity\nto eliminate a redundancy 15076 14:25:40,701 --> 14:25:43,341 doing something more\ntimes than you need. 15077 14:25:45,052 --> 14:25:47,685 AUDIENCE: You can do S dot lower, above. 15078 14:25:47,684 --> 14:25:49,851 DAVID J. MALAN: We could\nmove the S dot lower above. 15079 14:25:49,851 --> 14:25:51,832 Notice that I'm using S dot lower twice. 15080 14:25:51,832 --> 14:25:54,391 But it's going to give me\nthe same answer both times. 15081 14:25:54,391 --> 14:25:56,601 So I could do a couple of things here. 
15082 14:25:56,601 --> 14:26:01,222 I could, first of all, get rid of\n 15083 14:26:01,222 --> 14:26:05,241 and then above this, maybe I could\n 15084 14:26:05,241 --> 14:26:08,121 I can't just do this, because\nthat throws the value away. 15085 14:26:08,122 --> 14:26:10,762 It does the math, but it doesn't\nconvert the string itself. 15086 14:26:10,762 --> 14:26:12,362 It's going to return a value. 15087 14:26:12,362 --> 14:26:14,781 So I have to say S equals s.lower. 15088 14:26:15,862 --> 14:26:18,362 Or, honestly, I can chain\nthese things together. 15089 14:26:18,362 --> 14:26:22,592 And this is not something we saw in\n 15090 14:26:22,591 --> 14:26:25,761 and strings have functions\nlike lower in them 15091 14:26:25,762 --> 14:26:28,851 you can chain these functions\n 15092 14:26:28,851 --> 14:26:30,309 dot that, dot this other thing. 15093 14:26:30,309 --> 14:26:33,351 And eventually you want to stop,\n 15094 14:26:33,351 --> 14:26:35,332 But this is reasonable,\nstill fits on the screen. 15095 14:26:36,082 --> 14:26:38,211 It does in one place\nwhat I was doing in two. 15096 14:26:39,531 --> 14:26:42,502 Let me go ahead and do Python\nof Agree.py one last time. 15097 14:26:43,641 --> 14:26:46,881 And it's still working as intended. 15098 14:26:46,881 --> 14:26:49,221 Also if I tried those\nother inputs as well. 15099 14:26:49,957 --> 14:26:55,812 AUDIENCE: Could you add on like a for\n 15100 14:26:55,811 --> 14:26:59,222 and then cover all the functions where\n 15101 14:26:59,222 --> 14:27:01,972 where it's uppercase as well, or\n 15102 14:27:05,616 --> 14:27:06,991 DAVID J. MALAN: Let me summarize. 15103 14:27:06,991 --> 14:27:09,862 Could we handle uppercase and\nlowercase together in some form? 15104 14:27:09,862 --> 14:27:11,542 I'm actually doing that already. 
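The refinements discussed above (testing membership in a list, forcing the input to lowercase, and chaining method calls) can be sketched together. The helper name `agree` and the output strings are assumptions, and the use of `strip()` in the chaining example is an addition for illustration that the lecture does not show.

```python
def agree(s):
    """Classify a yes/no answer, ignoring case (hypothetical helper name)."""
    # str.lower() returns a new lowercase string; it never modifies s,
    # so the result must be assigned (or used directly).
    s = s.lower()
    if s in ["y", "yes"]:
        return "Agreed."
    elif s in ["n", "no"]:
        return "Not agreed."
    return None


answer = "YeS"
answer.lower()   # the lowercase copy is discarded; answer is unchanged
print(answer)
print(agree(answer))

# Method calls can be chained: strip() runs first, then lower() on its result.
cleaned = "  NO ".strip().lower()
print(agree(cleaned))
```

Lowercasing once and comparing against a lowercase list keeps the logic self-consistent, whatever mix of cases the user types.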
15105 14:27:12,891 --> 14:27:15,828 I have to either be all lowercase\nin my logic or all uppercase 15106 14:27:15,828 --> 14:27:17,661 and not worry about\nwhat the human types in 15107 14:27:17,661 --> 14:27:19,761 because no matter what\nthe human types in, I'm 15108 14:27:19,762 --> 14:27:21,472 forcing their input to lowercase. 15109 14:27:21,472 --> 14:27:24,802 And then I am using a\nlowercase list of values. 15110 14:27:24,802 --> 14:27:26,042 If I want to flip that, fine. 15111 14:27:26,042 --> 14:27:27,561 I just have to be self-consistent. 15112 14:27:27,561 --> 14:27:28,941 But I'm handling that already. 15113 14:27:29,745 --> 14:27:33,475 AUDIENCE: Are strings no\nlonger an array of characters? 15114 14:27:33,474 --> 14:27:35,391 DAVID J. MALAN: A really\ngood loaded questions 15115 14:27:35,391 --> 14:27:38,601 are strings no longer\nan array of characters? 15116 14:27:38,601 --> 14:27:40,641 Conceptually, yes,\nunderneath the hood, no. 15117 14:27:40,641 --> 14:27:42,711 They're a little more\nsophisticated than that 15118 14:27:42,711 --> 14:27:45,112 because with strings,\nyou have a few changes. 15119 14:27:45,112 --> 14:27:47,122 Not only do they have\nfunctions built into them 15120 14:27:47,122 --> 14:27:49,102 because strings are now\nwhat we call objects 15121 14:27:49,101 --> 14:27:51,021 in what's called\nobject-oriented programming. 15122 14:27:51,021 --> 14:27:53,563 And we're going to keep seeing\nexamples of this dot operator. 15123 14:27:53,563 --> 14:27:58,072 They are also immutable, so\nto speak, I-M-M-U-T-A-B-L-E. 15124 14:27:58,072 --> 14:28:01,701 Immutable means they cannot be\nchanged, which means, unlike C 15125 14:28:01,701 --> 14:28:05,271 you can't go into a string and\nchange its individual characters. 15126 14:28:05,271 --> 14:28:08,002 You can make a copy of the\nstring that makes a change 15127 14:28:08,002 --> 14:28:10,220 but you can't change the\noriginal string itself. 
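Immutability, as described above, can be seen in a few lines. This is a small sketch with arbitrary example values: assigning to one character of a str raises a TypeError, while a method such as `capitalize()` hands back a modified copy.

```python
s = "hi!"

try:
    s[0] = "H"          # item assignment is not supported on str
except TypeError as error:
    print("TypeError:", error)

# Build a modified copy instead; the original is left untouched.
t = s.capitalize()
print(s)
print(t)
```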
15128 14:28:10,220 --> 14:28:12,262 This is both a little\nannoying, maybe, sometimes. 15129 14:28:12,262 --> 14:28:14,887 But it's also pretty protective,\nbecause you can't do screw-ups 15130 14:28:14,887 --> 14:28:18,202 like I did weeks ago, when I was\ntrying to copy S and call it T. 15131 14:28:18,201 --> 14:28:19,791 And then one affected the other. 15132 14:28:19,792 --> 14:28:23,601 Python, underneath the hood, is\n 15133 14:28:23,601 --> 14:28:25,072 and the pointers and all of that. 15134 14:28:25,072 --> 14:28:27,561 There are no pointers in Python. 15135 14:28:27,561 --> 14:28:32,362 So If that wasn't clear, all of that\n 15136 14:28:32,362 --> 14:28:36,802 is now handled by the language\n 15137 14:28:36,802 --> 14:28:38,962 All right, so let's\nintroduce maybe some loops 15138 14:28:38,961 --> 14:28:40,911 like we've been in the habit of doing. 15139 14:28:40,911 --> 14:28:44,691 Let me open up Meow.c, which was\nan example in C, just meowing 15140 14:28:46,252 --> 14:28:49,322 Let me create a file called\nMeow.py here on the right. 15141 14:28:49,322 --> 14:28:51,711 And notice on the left,\nthis was correct code in C 15142 14:28:51,711 --> 14:28:53,192 but it was kind of poorly designed. 15143 14:28:53,692 --> 14:28:55,972 Because it was a missed\nopportunity for a loop. 15144 14:28:55,972 --> 14:28:58,982 Why say something three times\nwhen you can say it just once? 15145 14:28:58,982 --> 14:29:02,512 So in Python, let me do it\nthe poorly designed way first. 15146 14:29:03,921 --> 14:29:07,731 And, like I generally should not,\n 15147 14:29:07,732 --> 14:29:10,192 run Python of Meow.py, and it works. 15148 14:29:11,839 --> 14:29:13,881 So let me go ahead and\nimprove this a little bit. 15149 14:29:13,881 --> 14:29:15,511 And there's a few ways to do this. 
15150 14:29:15,512 --> 14:29:20,572 If I wanted to do this three times, I\n 15151 14:29:20,572 --> 14:29:24,531 For i in range of 3, recall that\nthat was the better version 15152 14:29:24,531 --> 14:29:27,891 rather than arbitrarily enumerate\n 15153 14:29:27,891 --> 14:29:30,012 and print out quote unquote "Meow. 15154 14:29:30,012 --> 14:29:32,599 Now if I run Python of\nMeow, still seems to work. 15155 14:29:32,599 --> 14:29:34,432 So it's a little tighter,\nand, my God, like 15156 14:29:34,432 --> 14:29:36,474 programs can't really get\nmuch shorter than this. 15157 14:29:36,474 --> 14:29:40,822 We're down to two lines of code, no\n 15158 14:29:40,822 --> 14:29:43,101 Let's now improve the\ndesign further, like we 15159 14:29:43,101 --> 14:29:46,072 did in C, by introducing\na function called 15160 14:29:46,072 --> 14:29:47,752 meow, that actually does the meowing. 15161 14:29:47,752 --> 14:29:49,521 So this was our first\nabstraction, recall 15162 14:29:49,521 --> 14:29:54,621 both in Scratch and in C. Let me focus\n 15163 14:29:55,281 --> 14:30:00,006 Let me go ahead and\nfirst define a function. 15164 14:30:03,411 --> 14:30:06,771 Let me first go ahead and do\nthis, for i in range of 3 15165 14:30:06,771 --> 14:30:09,951 let's assume for the moment\nthat there's a meow function 15166 14:30:09,951 --> 14:30:11,241 that I'm just going to call. 15167 14:30:11,241 --> 14:30:14,841 Let's now go ahead and define, using\n 15168 14:30:14,841 --> 14:30:17,691 with the speller\ndemonstration, a function 15169 14:30:17,692 --> 14:30:19,402 called meow that takes no arguments. 15170 14:30:19,402 --> 14:30:21,982 And all it does for now is print meow. 15171 14:30:21,982 --> 14:30:27,142 Let me now go ahead and run\nPython of Meow.py Enter, huh, one 15172 14:30:28,472 --> 14:30:30,601 So this is another name error. 15173 14:30:30,601 --> 14:30:33,601 And, again, name meow is not defined. 
15174 14:30:33,601 --> 14:30:35,601 What's your instinct here,\neven though we've not 15175 14:30:35,601 --> 14:30:37,281 tripped over this yet in Python? 15176 14:30:37,281 --> 14:30:39,652 Where does your mind go here? 15177 14:30:40,192 --> 14:30:42,602 AUDIENCE: Does it read top\nto bottom, left to right? 15178 14:30:42,601 --> 14:30:46,121 I'm guessing we could find a new case. 15179 14:30:46,122 --> 14:30:49,542 DAVID J. MALAN: Perfect, as smart,\n 15180 14:30:49,542 --> 14:30:51,292 it still makes certain assumptions. 15181 14:30:51,292 --> 14:30:54,531 And if it hasn't seen a keyword\nyet, it just doesn't exist. 15182 14:30:54,531 --> 14:30:57,521 So if you want it to exist, we\nhave to be a little clever here. 15183 14:30:57,521 --> 14:31:00,611 I could just put it, flip\nit around, like this. 15184 14:31:00,612 --> 14:31:02,991 But this honestly isn't\nparticularly good design. 15185 14:31:03,491 --> 14:31:06,911 Because now, if you, the reader\nof your code, whether you 15186 14:31:06,911 --> 14:31:09,491 wrote it or someone else, you\nkind of have to go fishing now. 15187 14:31:09,491 --> 14:31:11,081 Like where does this program begin? 15188 14:31:11,082 --> 14:31:14,652 And even though, yes, it's obvious\n 15189 14:31:14,652 --> 14:31:17,232 like, if the file were longer,\nyou're going to be annoyed 15190 14:31:17,232 --> 14:31:19,702 and fishing visually for\nthe right lines of code. 15191 14:31:20,919 --> 14:31:22,752 And indeed, this would\nbe a common paradigm. 15192 14:31:22,752 --> 14:31:25,902 When you want to start having\n 15193 14:31:25,902 --> 14:31:29,982 just put your own code in main, so that,\n 15194 14:31:29,982 --> 14:31:32,172 you can solve the problem\nwe just encountered. 15195 14:31:32,171 --> 14:31:35,381 So let me define a function called\nmain that has that same loop 15196 14:31:38,561 --> 14:31:43,871 Let me go into my terminal and\nrun Python of Meow.py, Enter. 15197 14:31:47,021 --> 14:31:50,572 All right, investigate this. 
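The name error discussed above comes from the interpreter reading the file top to bottom. A minimal sketch, catching the error only so that the script keeps running (the variable `error_message` is an assumption for illustration):

```python
error_message = None

# Calling a function before its def statement has run raises NameError.
try:
    for i in range(3):
        meow()
except NameError as error:
    error_message = str(error)
    print("NameError:", error_message)


def meow():
    print("meow")


# Once the def statement has executed, the same loop works.
for i in range(3):
    meow()
```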
15198 14:31:50,572 --> 14:31:52,811 What could explain this symptom. 15199 14:31:52,811 --> 14:31:54,542 I have not told you the answer yet. 15200 14:31:54,542 --> 14:31:56,292 So all you have is\nyour instinct, assuming 15201 14:31:56,292 --> 14:31:58,241 you've never touched Python before. 15202 14:31:58,241 --> 14:32:03,322 What might explain this symptom,\nwhere nothing is meowing? 15203 14:32:03,822 --> 14:32:05,491 AUDIENCE: Didn't run the main function. 15204 14:32:05,491 --> 14:32:07,699 DAVID J. MALAN: Yeah, I\ndidn't run the main function. 15205 14:32:07,699 --> 14:32:09,911 So in C, this is functionality\nyou get for free. 15206 14:32:09,911 --> 14:32:11,286 You have to have a main function. 15207 14:32:11,286 --> 14:32:14,101 But, heck, so long as you make\nit, it will be called for you. 15208 14:32:14,101 --> 14:32:17,911 In Python, this is just a convention,\nto create a main function 15209 14:32:17,911 --> 14:32:19,721 borrowing a very common name for it. 15210 14:32:19,722 --> 14:32:22,842 But if you want to call that\nmain function, you have to do it. 15211 14:32:22,841 --> 14:32:24,631 So this looks a little\nweird, admittedly 15212 14:32:24,631 --> 14:32:26,551 that you have to call your\nown main function now 15213 14:32:26,552 --> 14:32:28,382 and it has to be at\nthe bottom of the file 15214 14:32:28,381 --> 14:32:31,561 because only once the interpreter\n 15215 14:32:31,561 --> 14:32:34,981 have all of your functions\nbeen defined, higher up. 15216 14:32:34,982 --> 14:32:36,512 But this solves both problems. 15217 14:32:36,512 --> 14:32:38,972 It keeps your code, that's\nthe main part of your code 15218 14:32:38,972 --> 14:32:40,182 at the very top of the file. 15219 14:32:40,182 --> 14:32:43,502 So it's just obvious to you, and\n 15220 14:32:43,502 --> 14:32:45,662 where the program logically starts. 
15221 14:32:45,661 --> 14:32:49,831 But it also ensures that main is not\n 15222 14:32:52,182 --> 14:32:54,169 So this is another\nperfect example of we're 15223 14:32:54,169 --> 14:32:55,961 learning a new language\nfor the first time. 15224 14:32:55,961 --> 14:32:57,542 You're not going to have heard\nall of the answers before. 15225 14:32:57,542 --> 14:33:01,351 Just apply some logic, as to, like, all\n 15226 14:33:01,351 --> 14:33:04,711 Start to infer how the\nlanguage does or doesn't work. 15227 14:33:04,711 --> 14:33:08,972 If I now go and run this, Python of\n 15228 14:33:08,972 --> 14:33:11,882 And just so you have\nseen it, there is a quote 15229 14:33:11,881 --> 14:33:15,362 unquote "better" way of doing this,\n 15230 14:33:15,362 --> 14:33:18,572 are not going to encounter,\ncertainly in these initial days. 15231 14:33:18,572 --> 14:33:21,961 Typically, you would see in\nonline tutorials or books 15232 14:33:21,961 --> 14:33:25,921 something that looks like this, where\n 15233 14:33:27,332 --> 14:33:30,992 That's functionally the same thing,\n 15234 14:33:30,991 --> 14:33:34,362 if we ourselves were implementing a\n 15235 14:33:34,362 --> 14:33:37,404 But we're going to keep things simpler\n 15236 14:33:37,404 --> 14:33:39,877 because we're not going to\nencounter that problem just yet. 15237 14:33:39,877 --> 14:33:42,752 All right, let's make one change to\n 15238 14:33:42,752 --> 14:33:47,942 In C, the last version of meow also\n 15239 14:33:47,942 --> 14:33:50,432 took arguments to the function meow. 15240 14:33:50,432 --> 14:33:53,012 So suppose that I want\nto factor this out. 15241 14:33:53,012 --> 14:33:55,772 And I want to just call meow as a\n 15242 14:33:55,771 --> 14:33:57,601 say meow this number of times. 
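The convention described above, with main at the top of the file and the call to it at the very bottom, can be sketched like this. The guarded form shown is the one commonly seen in tutorials; a bare `main()` on the last line behaves the same way when the file is run directly.

```python
def main():
    for i in range(3):
        meow()


def meow():
    print("meow")


# By the time this line runs, both def statements above have executed,
# so main can call meow even though meow is defined below it.
if __name__ == "__main__":
    main()
```

The guard only matters when the file is imported by another program, in which case main is not called automatically.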
15243 14:33:57,601 --> 14:34:00,811 And I figure out how many times\n 15244 14:34:00,811 --> 14:34:03,511 or using getInt or something\nlike that, to figure out 15245 14:34:05,072 --> 14:34:08,341 Well, now, I have to define\ninside my meow function, in input 15246 14:34:08,341 --> 14:34:14,851 let's call it n, and then use that,\n 15247 14:34:14,851 --> 14:34:18,161 let me go ahead and print\nout meow that many times. 15248 14:34:18,161 --> 14:34:20,341 So again, the only thing\nthat's different in C 15249 14:34:20,341 --> 14:34:24,151 is we don't bother specifying return\n 15250 14:34:24,152 --> 14:34:28,752 and we don't bother specifying the\n 15251 14:34:28,752 --> 14:34:31,451 So same ideas, simpler in some sense. 15252 14:34:31,451 --> 14:34:33,182 We're just throwing away keystrokes. 15253 14:34:33,182 --> 14:34:35,972 All right, let me run this one\nfinal time, Python of Meow.py 15254 14:34:35,972 --> 14:34:38,912 and we still have the same program. 15255 14:34:38,911 --> 14:34:40,631 All right, let me pause here. 15256 14:34:41,302 --> 14:34:42,552 And I know this is going fast. 15257 14:34:42,552 --> 14:34:47,877 But hopefully, the C code\nis still somewhat familiar. 15258 14:34:48,377 --> 14:34:54,052 AUDIENCE: Is there any difference\n 15259 14:34:54,052 --> 14:34:55,302 DAVID J. MALAN: Good question. 15260 14:34:55,302 --> 14:34:57,760 Is there any difference between\nglobal and local variables? 15261 14:34:57,760 --> 14:35:00,372 Short answer, yes, and we would\nrun into that same problem 15262 14:35:00,372 --> 14:35:01,842 if we declare a variable\nin one function 15263 14:35:01,841 --> 14:35:03,966 another function is not\ngoing to have access to it. 15264 14:35:03,966 --> 14:35:07,181 We can solve that by\nputting variables globally. 15265 14:35:07,182 --> 14:35:09,281 But we don't have all of\nthe features we had in C 15266 14:35:09,281 --> 14:35:11,682 like there's no such thing\nas a constant in Python. 
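The parameterized meow function described above might be sketched as follows. The count of 3 is hard-coded here as an assumption; in the lecture it could equally come from user input.

```python
def main():
    meow(3)


def meow(n):
    # No return type and no parameter type are declared, unlike in C.
    for i in range(n):
        print("meow")


main()
```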
15267 14:35:11,682 --> 14:35:13,421 The mentality in the\nPython community is 15268 14:35:13,421 --> 14:35:16,002 if you don't want some value\nto change, don't touch it. 15269 14:35:17,152 --> 14:35:18,762 So there's trade-offs here, too. 15270 14:35:18,762 --> 14:35:21,522 Some languages are stronger\nor more defensive than that. 15271 14:35:21,521 --> 14:35:25,511 But that, too, is part of the mindset\n 15272 14:35:27,167 --> 14:35:29,459 AUDIENCE: There is really\nonly one green line, in the-- 15273 14:35:29,459 --> 14:35:30,959 DAVID J. MALAN: Oh, sorry, where's-- 15274 14:35:31,601 --> 14:35:34,864 AUDIENCE: There has only been\none green line printed at a time. 15275 14:35:34,864 --> 14:35:36,572 DAVID J. MALAN: That\nis an amazing segue. 15276 14:35:36,572 --> 14:35:37,891 Let's come to that in just\na moment, because we're 15277 14:35:37,891 --> 14:35:40,141 going to recreate also\nthat Mario example, where 15278 14:35:40,141 --> 14:35:43,447 we had like the question marks for\n 15279 14:35:43,447 --> 14:35:45,072 So let's come back to that in a second. 15280 14:35:46,177 --> 14:35:49,883 AUDIENCE: If strings are immutable,\n 15281 14:35:49,883 --> 14:35:51,841 DAVID J. MALAN: Correct,\nstrings are immutable. 15282 14:35:51,841 --> 14:35:55,741 Any time you seem to be modifying\n 15283 14:35:57,002 --> 14:35:59,461 So it's taking a little\nmore memory somewhere. 15284 14:35:59,461 --> 14:36:02,667 But you don't have to deal with\nit Python's doing that for you. 15285 14:36:02,667 --> 14:36:05,414 AUDIENCE: So you don't free anything. 15286 14:36:05,413 --> 14:36:06,621 DAVID J. MALAN: Say it again? 15287 14:36:07,747 --> 14:36:11,184 AUDIENCE: You don't free\nlike taking leave on stuff. 15288 14:36:11,184 --> 14:36:12,851 DAVID J. MALAN: You don't free anything. 
15289 14:36:12,851 --> 14:36:15,391 So if you weren't a big fan,\nover the past couple of weeks 15290 14:36:15,391 --> 14:36:19,381 of malloc or free or\nmemory or addresses, or all 15291 14:36:19,381 --> 14:36:21,511 of those low level\nimplementation details 15292 14:36:21,512 --> 14:36:23,912 Python is the language for\nyou, because all of that 15293 14:36:23,911 --> 14:36:25,861 is handled for you automatically. 15294 14:36:28,982 --> 14:36:34,766 AUDIENCE: Each up for the variable, you\n 15295 14:36:36,222 --> 14:36:40,307 Well, if there isn't a main function in\n 15296 14:36:40,307 --> 14:36:42,432 DAVID J. MALAN: How do you\ndefine a global variable 15297 14:36:42,432 --> 14:36:44,014 if there's no main function in Python? 15298 14:36:44,014 --> 14:36:48,002 Global variables, by definition, always\n 15299 14:36:49,002 --> 14:36:51,822 If I wanted to have a\nfunction that's outside of 15300 14:36:51,822 --> 14:36:56,224 and, therefore, global to\nall of these, like global-- 15301 14:36:56,224 --> 14:36:59,141 actually, don't use the word global,\n 15302 14:36:59,141 --> 14:37:03,972 variable equals Foo, F-O-O,\njust as an arbitrary string 15303 14:37:03,972 --> 14:37:07,932 value that a computer scientist would\n 15304 14:37:07,932 --> 14:37:10,521 There are some caveats, though,\nas to how you access that. 15305 14:37:10,521 --> 14:37:12,531 But let's come back\nto that another time. 15306 14:37:12,531 --> 14:37:14,551 But that problem is solvable, too. 15307 14:37:15,052 --> 14:37:16,302 So let's go ahead and do this. 15308 14:37:16,302 --> 14:37:19,572 To come back to the question about\n 15309 14:37:19,572 --> 14:37:21,822 and create a file now called Mario.py. 15310 14:37:21,822 --> 14:37:24,222 Won't bother showing the C code anymore. 15311 14:37:24,222 --> 14:37:26,112 We'll focus just on\nthe new language here. 
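A variable defined outside every function, as in the answer above, is global to the file. The sketch below only reads the variable from inside a function; the function name `show` is an assumption for illustration, and the caveats about assigning to a global from inside a function are left aside, as they are in the lecture.

```python
variable = "foo"  # defined outside every function, so it is global


def main():
    show()


def show():
    # Reading a global from inside a function needs no extra syntax.
    print(variable)


main()
```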
15312 14:37:26,112 --> 14:37:31,061 But recall that, in Python, in Mario, we\n 15313 14:37:31,061 --> 14:37:34,121 This was a random screen from\nthe side scroller version 1 15314 14:37:35,322 --> 14:37:39,341 And we just want to print like three\n 15315 14:37:39,341 --> 14:37:41,471 Well, in Python, we could\ndo something like this 15316 14:37:41,472 --> 14:37:47,802 print, oh, sorry, for i in the range of\n 15317 14:37:48,349 --> 14:37:50,141 And I think this is\npretty straightforward. 15318 14:37:50,141 --> 14:37:52,781 Python of Mario.py, we\nget our three hashes. 15319 14:37:52,781 --> 14:37:55,371 You could imagine\nparameterizing this now, though 15320 14:37:55,372 --> 14:37:56,872 and getting actual user input. 15321 14:37:58,252 --> 14:38:03,942 Let me go up here and let me go\n 15322 14:38:03,942 --> 14:38:07,612 and then let's get the\ninput from the user. 15323 14:38:07,612 --> 14:38:09,732 So it actually is a\nvalue n, like, all right 15324 14:38:09,732 --> 14:38:14,712 getInt the height of the column\nof bricks that you want to do. 15325 14:38:14,711 --> 14:38:18,792 And then, let's go ahead and print\n 15326 14:38:20,082 --> 14:38:21,906 Let's print out like five hashes. 15327 14:38:21,906 --> 14:38:24,281 OK, one, two, three, four,\nfive, that seems to work, too. 15328 14:38:24,281 --> 14:38:26,199 And it's going to work\nfor any positive value. 15329 14:38:26,199 --> 14:38:29,921 But it's not going to work\nfor, how about negative 1? 15330 14:38:29,921 --> 14:38:31,182 That just doesn't do anything. 15331 14:38:32,269 --> 14:38:35,351 But also recall that it's not going\n 15332 14:38:35,351 --> 14:38:40,512 weird, like, oh, sorry, it is going\n 15333 14:38:42,311 --> 14:38:45,341 We're using CS50's\ngetInt function, which is 15334 14:38:45,341 --> 14:38:48,231 handling all of those headaches for us. 15335 14:38:48,232 --> 14:38:51,702 But, what if the user indeed\ntypes a negative number? 
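The column of bricks, and the negative-number case just raised, can be sketched as below. The helper name `column` is an assumption, and the height is passed in directly rather than read with the CS50 library.

```python
def column(n):
    """Print a column of n hashes (hypothetical helper name)."""
    for i in range(n):
        print("#")


column(3)
column(-1)  # range(-1) is empty, so nothing is printed
```

Because `range` of a negative number produces no values, the loop body never runs, which is why the program silently prints nothing instead of re-prompting.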
15336 14:38:52,631 --> 14:38:54,381 So that was the bug I\nwanted to highlight. 15337 14:38:54,381 --> 14:38:56,771 It would be nice to re-prompt\nthem and re-prompt them. 15338 14:38:56,771 --> 14:38:59,081 And in C, what was the\nprogramming construct we 15339 14:38:59,082 --> 14:39:01,542 used when we wanted to\nask the user a question. 15340 14:39:01,542 --> 14:39:05,802 And then, if they didn't cooperate,\n 15341 14:39:07,271 --> 14:39:08,621 DAVID J. MALAN: Yeah,\ndo while loop, right? 15342 14:39:08,622 --> 14:39:11,352 That was useful, because it's\nalmost the same as a while loop. 15343 14:39:11,351 --> 14:39:14,621 But instead of checking a\ncondition, and then doing something 15344 14:39:14,622 --> 14:39:16,470 you do something and\nthen check a condition 15345 14:39:16,470 --> 14:39:18,762 which makes sense with user\ninput, because what are you 15346 14:39:18,762 --> 14:39:21,137 even going to check if the\nuser hasn't done anything yet? 15347 14:39:21,137 --> 14:39:22,722 You need that inverted logic. 15348 14:39:22,722 --> 14:39:26,531 Unfortunately in Python,\nthere is no do while loop. 15349 14:39:29,262 --> 14:39:32,112 And frankly, those are\nenough to recreate this idea. 15350 14:39:32,112 --> 14:39:35,682 And the way to do this in\nPython, the Pythonic way, which 15351 14:39:35,682 --> 14:39:38,682 is another term of art in the\ncommunity, is to say this. 15352 14:39:38,682 --> 14:39:42,822 Deliberately induce an infinite loop,\n 15353 14:39:42,822 --> 14:39:46,451 And then do what you got to do,\nlike get an Int from a user 15354 14:39:46,451 --> 14:39:48,581 asking them for the\nheight of this thing. 15355 14:39:48,582 --> 14:39:54,792 And then, if that is what you want, like\n 15356 14:39:56,542 --> 14:40:01,961 So this is how, in Python, you could\n 15357 14:40:01,961 --> 14:40:03,836 You deliberately induce\nan infinite loop. 15358 14:40:03,836 --> 14:40:05,711 So something's going to\nhappen at least once. 
15359 14:40:05,711 --> 14:40:08,801 Then, if you get the answer\nyou want, you break out of it 15360 14:40:08,802 --> 14:40:10,852 effectively achieving the same logic. 15361 14:40:10,851 --> 14:40:13,601 So this is the Pythonic way\nof doing a do while loop. 15362 14:40:13,601 --> 14:40:18,281 Let me go ahead and run Python\nof Mario.py, type in 3 this time. 15363 14:40:18,281 --> 14:40:21,192 And now I get back just\nthe 3 hashes as well. 15364 14:40:21,192 --> 14:40:26,832 What if, though, I wanted to\nget rid of, how about ultimately 15365 14:40:26,832 --> 14:40:31,580 that CS50 library function, and\n 15366 14:40:31,580 --> 14:40:33,622 Well, let's go ahead and\ntweak this a little bit. 15367 14:40:33,622 --> 14:40:35,592 Let me go ahead and\nremove this temporarily. 15368 14:40:35,591 --> 14:40:38,201 Give myself a main function, so\nI don't make the same mistake 15369 14:40:39,881 --> 14:40:43,631 And let me give myself a function called\n 15370 14:40:43,631 --> 14:40:47,141 And inside of that function\nis going to be that same code. 15371 14:40:47,141 --> 14:40:50,802 But I don't want to break in\nthis case, I want to return n. 15372 14:40:50,802 --> 14:40:53,815 So, recall, that if you return\nfrom a function, you're done 15373 14:40:53,815 --> 14:40:55,732 you're going to exit\nfrom right at that point. 15374 14:40:56,841 --> 14:40:59,201 You can just say return\nn inside of the loop 15375 14:40:59,201 --> 14:41:01,841 or, if you would prefer\nto break out, you 15376 14:41:01,841 --> 14:41:03,461 could do something like this instead. 15377 14:41:03,461 --> 14:41:09,222 Break, and then down here,\nyou could return, down here 15378 14:41:11,152 --> 14:41:13,812 And let me make one point here\nbefore we go back up to main. 15379 14:41:13,811 --> 14:41:18,011 This is a little different\nfrom C. And this one's subtle. 15380 14:41:18,012 --> 14:41:23,772 What have I done here that in C would\n 15381 14:41:27,381 --> 14:41:28,741 It's super subtle, this one. 
15382 14:41:29,241 --> 14:41:32,432 AUDIENCE: So aren't we like\ndefining mostly object 15383 14:41:32,432 --> 14:41:35,991 like we're using it\nfirst, defining an object? 15384 14:41:40,796 --> 14:41:43,671 DAVID J. MALAN: So similar, it's\n 15385 14:41:43,671 --> 14:41:47,502 So it's OK not to declare a\nvariable with like the data type. 15386 14:41:47,502 --> 14:41:51,942 We've addressed that before, but on line\n 15387 14:41:51,942 --> 14:41:55,122 And then we return n on line 12. 15388 14:41:56,711 --> 14:42:01,932 In the world of C, if we had declared\n 15389 14:42:01,932 --> 14:42:04,722 it would have been scoped\nto that loop, which 15390 14:42:04,722 --> 14:42:08,052 means as soon as you get out of that\n 15391 14:42:09,862 --> 14:42:12,612 It would be local to the\ncurly braces therein. 15392 14:42:12,612 --> 14:42:16,241 Here, logically, curly braces\nare gone, but the indentation 15393 14:42:16,241 --> 14:42:20,771 makes clear that n is still inside of\n 15394 14:42:20,771 --> 14:42:23,801 But n is actually still\nin scope in Python. 15395 14:42:23,802 --> 14:42:26,902 The moment you create a variable\n 15396 14:42:26,902 --> 14:42:30,281 It is available everywhere within\nthat function, even outside 15397 14:42:30,281 --> 14:42:32,211 of the loop in which you defined it. 15398 14:42:32,211 --> 14:42:35,591 So this logic is actually OK in Python. 15399 14:42:35,591 --> 14:42:38,659 In C, recall, to solve\nthis same problem 15400 14:42:38,660 --> 14:42:41,202 we would have had to do something\na little hackish like this 15401 14:42:41,201 --> 14:42:46,121 like define n up here on line 8,\n 15402 14:42:46,122 --> 14:42:48,522 and so that it exists on line 13. 15403 14:42:48,521 --> 14:42:52,221 That is no longer an\nissue or need, in Python. 
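The two ideas above, emulating a do while loop with a deliberate infinite loop and relying on function-level scope, can be sketched together. To keep the example runnable without a keyboard, the hypothetical parameter `answers` is an iterator of integers standing in for repeated prompts; the lecture's version prompts the user instead.

```python
def get_height(answers):
    # `answers` is a hypothetical iterator of integers that stands in
    # for asking the user for a height again and again.
    while True:
        n = next(answers)
        if n > 0:
            break
    # n was first assigned inside the loop, but it is still in scope here:
    # Python variables are scoped to the function, not to the block.
    return n


print(get_height(iter([-1, 0, 3])))
```

Returning `n` directly from inside the loop, in place of `break`, would have the same effect.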
15404 14:42:52,222 --> 14:42:54,222 Once you create a variable,\neven if it's nested 15405 14:42:54,222 --> 14:42:56,389 nested, nested inside of\nsome loops or conditionals 15406 14:42:56,389 --> 14:43:00,042 it still exists within\nthe function itself. 15407 14:43:00,042 --> 14:43:04,391 All right, any questions then on this,\n 15408 14:43:04,391 --> 14:43:08,201 rid of the CS50 library again? 15409 14:43:08,201 --> 14:43:10,822 OK, so let me go ahead and\nget the height from the user. 15410 14:43:10,822 --> 14:43:13,279 Let's go ahead and create a\nvariable in main called height. 15411 14:43:13,279 --> 14:43:14,981 Let's call this get height function. 15412 14:43:14,982 --> 14:43:19,902 And then let's use that height value,\n 15413 14:43:19,902 --> 14:43:21,522 And let me see if this all works now. 15414 14:43:22,932 --> 14:43:25,631 Hopefully, I haven't\nmessed up, but I did. 15415 14:43:25,631 --> 14:43:27,981 But this is an easy fix now. 15416 14:43:28,482 --> 14:43:29,607 AUDIENCE: Got to call main. 15417 14:43:29,607 --> 14:43:31,065 DAVID J. MALAN: I got to call main. 15418 14:43:31,065 --> 14:43:32,502 So again, I deleted that earlier. 15419 14:43:33,442 --> 14:43:34,650 So I'm actually calling main. 15420 14:43:34,650 --> 14:43:38,711 Let me rerun Python of\nMario.py, there we go, height 3. 15421 14:43:40,402 --> 14:43:42,402 So let's do one last\nthing with Mario, just 15422 14:43:42,402 --> 14:43:45,502 to tie together that idea now\nof exceptions from before. 15423 14:43:45,502 --> 14:43:47,591 Again, exceptions are\na feature of Python 15424 14:43:47,591 --> 14:43:49,581 whereby you can try to do something. 15425 14:43:49,582 --> 14:43:53,232 And if there's a problem, you can\n 15426 14:43:53,232 --> 14:43:56,592 Previously, I handled it by just yelling\n 15427 14:43:56,591 --> 14:43:59,981 But let's actually use this to\nre-implement CS50's own getInt 15428 14:44:00,762 --> 14:44:03,652 Let me throw away\nCS50's getInt function. 
15429 14:44:03,652 --> 14:44:09,402 And now let me go ahead and\nreplace getInt with input. 15430 14:44:09,402 --> 14:44:12,192 But it's not sufficient\nto just use input. 15431 14:44:12,192 --> 14:44:16,002 What do I have to add to\nthis line of code on line 8? 15432 14:44:16,002 --> 14:44:17,262 If I want to get back an Int? 15433 14:44:18,311 --> 14:44:20,353 DAVID J. MALAN: Yeah, I\nhave to cast it to an Int 15434 14:44:20,353 --> 14:44:23,021 by calling the Int\nfunction around that value 15435 14:44:23,021 --> 14:44:25,271 or I could do it on a separate\nline, just to be clear. 15436 14:44:25,271 --> 14:44:28,631 I could also do n equals Int of n. 15437 14:44:28,631 --> 14:44:31,542 That would work too, but it's\nsort of an unnecessary extra line. 15438 14:44:31,542 --> 14:44:34,512 This is not sufficient, because\nthat does not change the value. 15439 14:44:35,457 --> 14:44:36,582 But then it throws it away. 15440 14:44:37,713 --> 14:44:40,421 So the conventional way to do this\n 15441 14:44:40,421 --> 14:44:41,879 just to keep things nice and tight. 15442 14:44:43,302 --> 14:44:47,992 If I run Python of Mario.py, I can\n 15443 14:44:47,991 --> 14:44:52,241 I can still type in negative 1, because\n 15444 14:44:52,241 --> 14:44:55,271 What I'm not yet handling\nis weird input like cat 15445 14:44:55,271 --> 14:44:58,281 or some string that is\nnot a base 10 number. 15446 14:44:58,281 --> 14:45:00,402 So here, again, is my traceback. 15447 14:45:00,402 --> 14:45:03,522 And notice that here, let\nme scroll up a little bit 15448 14:45:03,521 --> 14:45:08,141 here we can actually see\nmore detail in the traceback. 15449 14:45:08,141 --> 14:45:13,421 Notice that, just like in C, or just\n 15450 14:45:14,622 --> 14:45:18,012 You can see mention of module, that\n 15451 14:45:18,012 --> 14:45:19,535 is my main function, and get height. 15452 14:45:19,535 --> 14:45:20,952 So notice, it's kind of backwards. 
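A short sketch of the conversion step described above, using a hard-coded string in place of real keyboard input (an assumption made so the snippet runs unattended). It shows why calling `int` without assigning the result changes nothing, and what kind of error non-numeric text such as `cat` triggers.

```python
text = "3"            # input() always hands back a str like this
int(text)             # computes 3, but the result is thrown away
assert text == "3"    # text itself is unchanged

n = int(text)         # keep the converted value by assigning it
assert n == 3

try:
    int("cat")        # not a base-10 number
except ValueError as error:
    message = str(error)
print(message)
```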
15453 14:45:20,951 --> 14:45:23,241 It's top to bottom instead\nof bottom up, as we drew it 15454 14:45:23,241 --> 14:45:25,241 on the board the other\nday, and as we envisioned 15455 14:45:25,241 --> 14:45:27,042 stacks of trays in the cafeteria. 15456 14:45:27,042 --> 14:45:29,202 But this is your stack,\nof functions that 15457 14:45:29,201 --> 14:45:30,851 have been called, from top to bottom. 15458 14:45:30,851 --> 14:45:33,881 Get height is the most recent,\nmain is the very first 15459 14:45:35,722 --> 14:45:40,262 So let's try to do, let's try to do this\n 15460 14:45:41,262 --> 14:45:46,242 I'm going to go in here, and I'm\n 15461 14:45:46,241 --> 14:45:53,591 Whoops, try to do the following, except\n 15462 14:45:53,591 --> 14:45:57,161 then go ahead and say something,\nwell, like before, print 15463 14:45:57,161 --> 14:46:00,351 that's not an integer exclamation point. 15464 14:46:00,351 --> 14:46:03,281 But the difference this time is\nbecause I'm in a loop, the user 15465 14:46:03,281 --> 14:46:05,722 is going to have a chance\nto recover from this issue. 15466 14:46:05,722 --> 14:46:08,862 So if I run Mario.py, 3\nstill works as before. 15467 14:46:08,862 --> 14:46:12,402 If I run Mario.py and type\nin cat, I detect it now 15468 14:46:12,402 --> 14:46:15,762 and because I'm still in that loop,\n 15469 14:46:15,762 --> 14:46:19,572 because I've caught, so to speak, the\n 15470 14:46:19,572 --> 14:46:23,472 here, that's the way in Python\nto detect these kinds of errors 15471 14:46:23,472 --> 14:46:26,202 that would otherwise end up\nbeing on the user's own screen. 15472 14:46:26,201 --> 14:46:28,061 If I type in cat, dog,\nthat doesn't work. 
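The try/except loop described above might look roughly like this. It is a sketch rather than the lecture's exact file: the `read` parameter is a hypothetical stand-in for `input` so the loop can be tested, and the message text is illustrative.

```python
def get_height(read=input):
    """Prompt until the user types a positive integer."""
    while True:
        try:
            n = int(read("Height: "))
        except ValueError:
            # int() could not parse the text; report it and loop again
            print("That's not an integer!")
            continue
        if n > 0:
            return n
```

Because the `except` clause sits inside the loop, bad input such as `cat` leads to another prompt instead of a traceback.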
15473 14:46:28,061 --> 14:46:33,341 If I type in, though, 2, I get my two\n 15474 14:46:33,341 --> 14:46:35,261 Are any questions on\nthis, and we're not going 15475 14:46:35,262 --> 14:46:37,272 to spend too much time on\nexceptions, but just wanted 15476 14:46:37,271 --> 14:46:40,201 to show you what's involved with\n 15477 14:46:40,701 --> 14:46:42,284 AUDIENCE: Then the hash marks in line. 15478 14:46:42,285 --> 14:46:43,827 DAVID J. MALAN: OK, so let's do this. 15479 14:46:43,826 --> 14:46:45,661 That actually comes to\nthe earlier question 15480 14:46:45,661 --> 14:46:47,581 about printing the\nhashes on the same line 15481 14:46:47,582 --> 14:46:50,330 or maybe something like this,\nwhere we have the little bricks 15482 14:46:50,330 --> 14:46:51,872 in the sky, or little question marks. 15483 14:46:51,872 --> 14:46:54,247 Let's recreate this idea,\nbecause the problem with print 15484 14:46:54,247 --> 14:46:57,452 as was noted earlier, is you're\n 15485 14:46:57,451 --> 14:46:58,981 But what if we don't want that. 15486 14:46:58,982 --> 14:47:01,262 Well, let's change\nthis program entirely. 15487 14:47:01,262 --> 14:47:02,832 Let me throw away all the functions. 15488 14:47:02,832 --> 14:47:05,742 Let's just go to a simpler world,\nwhere we're just doing this. 15489 14:47:05,741 --> 14:47:07,434 So let me start fresh in Mario.py. 15490 14:47:07,434 --> 14:47:09,641 I'm not going to bother with\nexceptions or functions. 15491 14:47:09,641 --> 14:47:15,932 Let's just do a very simple program, to\n 15492 14:47:15,932 --> 14:47:19,381 this time, because there are\nfour of these things in the sky. 15493 14:47:19,381 --> 14:47:21,752 Let's go ahead and just\nprint out a question mark 15494 14:47:21,752 --> 14:47:23,972 to represent each of those bricks. 15495 14:47:23,972 --> 14:47:27,662 Odds are you know this not going to end\n 15496 14:47:27,661 --> 14:47:30,971 as you've predicted, on separate lines. 
15497 14:47:30,972 --> 14:47:33,902 So it turns out that the\nprint function actually 15498 14:47:33,902 --> 14:47:36,842 takes in multiple arguments, not\n 15499 14:47:36,841 --> 14:47:40,171 but also some additional arguments,\nthat allow you to specify 15500 14:47:40,171 --> 14:47:42,691 what the default line ending should be. 15501 14:47:42,692 --> 14:47:45,632 But what's interesting\nabout this is that, if you 15502 14:47:45,631 --> 14:47:49,151 want to change the line\nending to be something like 15503 14:47:49,152 --> 14:47:53,312 quote unquote, "that is\nnothing," instead of backslash n 15504 14:47:53,311 --> 14:47:55,831 this is not sufficient,\nbecause in Python, you 15505 14:47:55,832 --> 14:47:58,292 can have two types of\narguments, or parameters. 15506 14:47:58,292 --> 14:48:01,682 Some arguments are positional, which\n 15507 14:48:01,682 --> 14:48:03,211 a comma separated list of arguments. 15508 14:48:03,211 --> 14:48:06,061 And that's what we did all the time\n 15509 14:48:06,061 --> 14:48:08,186 comma, something, we did\nit in printf all the time 15510 14:48:08,186 --> 14:48:10,502 and in other functions that\ntook multiple arguments. 15511 14:48:10,502 --> 14:48:14,402 In Python, you have, not\nonly positional arguments 15512 14:48:14,402 --> 14:48:18,182 where you just separate them by commas,\n 15513 14:48:19,171 --> 14:48:22,741 There are also named arguments,\nwhich looks weird but is 15514 14:48:22,741 --> 14:48:24,661 helpful for reasons like this. 15515 14:48:24,661 --> 14:48:27,421 If you read the\ndocumentation, you will see 15516 14:48:27,421 --> 14:48:31,261 that there is a named argument\nthat Python accepts, called end. 
15517 14:48:31,262 --> 14:48:34,202 And if you set that\nequal to something, that 15518 14:48:34,201 --> 14:48:36,721 will be used as the end\nof every line, instead 15519 14:48:36,722 --> 14:48:39,272 of the default, which the\ndocumentation will also say 15520 14:48:39,271 --> 14:48:41,221 is quote unquote backslash n. 15521 14:48:41,222 --> 14:48:45,522 So this line here has no effect\non my logic at the moment. 15522 14:48:45,521 --> 14:48:49,801 But if I change it to just quote\nunquote, essentially overriding 15523 14:48:49,802 --> 14:48:54,992 the default new line character, and\n 15524 14:48:55,800 --> 14:48:57,092 There's a bit of a bug, though. 15525 14:48:57,091 --> 14:49:00,131 My prompt is not meant\nto be on the same line. 15526 14:49:00,131 --> 14:49:02,161 So I can fix that by\njust printing nothing. 15527 14:49:02,161 --> 14:49:05,161 But, really, it's not nothing,\n 15528 14:49:05,161 --> 14:49:09,451 So let me run Python of\nMario.py again, and now we 15529 14:49:09,451 --> 14:49:12,661 have what I intended in the first\n 15530 14:49:13,692 --> 14:49:17,432 And this is just one example\nof an argument that has a name. 15531 14:49:17,432 --> 14:49:19,802 But this is a common\nparadigm in Python, too, 15532 14:49:19,802 --> 14:49:22,772 to not just separate things by\ncommas, but to be very specific 15533 14:49:22,771 --> 14:49:27,331 because the print function might take\n 15534 14:49:27,332 --> 14:49:31,150 And my God, if you had to\nenumerate like 10 or 20 commas 15535 14:49:32,192 --> 14:49:34,109 You're going to get\nthings in the wrong order. 15536 14:49:34,108 --> 14:49:37,121 Named arguments allow you to\nbe resilient against that. 15537 14:49:37,122 --> 14:49:39,211 So you only specify\narguments by name, and it 15538 14:49:39,211 --> 14:49:42,525 doesn't matter what order they are in.
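A small sketch of the named `end` argument discussed above. The function names `print_bricks` and `print_pair` are hypothetical; `end` and `sep` are real keyword arguments of the built-in `print`.

```python
def print_bricks(count):
    """Print count question marks on one line."""
    for _ in range(count):
        print("?", end="")   # override the default end="\n"
    print()                  # a bare print() emits just the newline


def print_pair():
    # Named arguments may be given in either order after the positional ones
    print("a", "b", sep="-", end="!\n")
    print("a", "b", end="!\n", sep="-")


print_bricks(4)
print_pair()
```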
15539 14:49:42,525 --> 14:49:46,682 All right, any questions, then, on\n 15540 14:49:46,682 --> 14:49:50,792 And to be clear, you can do\nsomething like, very weird 15541 14:49:50,792 --> 14:49:56,432 but logically expected, like this, by\n 15542 14:49:56,432 --> 14:49:58,351 But the right way to\nsolve the Mario problem 15543 14:49:58,351 --> 14:50:02,173 would be just to override\nit to be nothing like this. 15544 14:50:02,173 --> 14:50:03,631 All right, how about this for cool. 15545 14:50:03,631 --> 14:50:05,521 And this is why a lot\nof people like Python. 15546 14:50:05,521 --> 14:50:06,961 Suppose you don't really like loops. 15547 14:50:06,961 --> 14:50:08,491 You don't really like\nthree-line programs 15548 14:50:08,491 --> 14:50:11,158 because that was kind of three\ntimes longer than it needs to be. 15549 14:50:11,158 --> 14:50:15,722 What if you just printed out\na question mark four times? 15550 14:50:15,722 --> 14:50:19,902 Python, whoops, Python of\nMario.py, that also works. 15551 14:50:19,902 --> 14:50:23,072 So it turns out that, just like\nthe plus operator in Python 15552 14:50:23,072 --> 14:50:27,091 can join things together,\nthe multiply operator is not 15553 14:50:28,362 --> 14:50:32,592 It actually means, take this and\nconcatenate it four times over. 15554 14:50:32,591 --> 14:50:35,521 So that's a way of just\ndistilling into one line what 15555 14:50:35,521 --> 14:50:39,271 would have otherwise taken multiple\n 15556 14:50:39,271 --> 14:50:43,651 lines in Python, but is really\nnow rather succinct in Python 15557 14:50:44,906 --> 14:50:48,031 Let's do one last Mario example, which\n 15558 14:50:48,031 --> 14:50:50,612 If this is another part\nof the Mario interface 15559 14:50:50,612 --> 14:50:53,322 this is like a grid of like\n3 by 3 bricks, for instance. 15560 14:50:53,322 --> 14:50:57,211 So two dimensions now, just not just\n 15561 14:50:57,211 --> 14:50:59,652 Let's print out something\nlike that, using hashes. 
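The one-line version described above relies on `*` repeating a string. A minimal sketch, with the variable name `row` chosen here for illustration:

```python
row = "?" * 4          # repeat the string four times
print(row)

# Repetition is equivalent to joining the pieces with +
assert row == "?" + "?" + "?" + "?"
```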
15562 14:50:59,652 --> 14:51:02,592 Well, how about, how do I do this? 15563 14:51:02,591 --> 14:51:05,731 So how about for i in range of 3. 15564 14:51:05,732 --> 14:51:10,802 Then I could do for j in range of\n 15565 14:51:10,802 --> 14:51:12,332 and that's reasonable for counting. 15566 14:51:12,332 --> 14:51:17,522 I could now print out a hash symbol,\n 15567 14:51:17,521 --> 14:51:24,182 Python of Mario.py, OK, that's\njust one crazy long column. 15568 14:51:24,182 --> 14:51:27,762 What do I need to fix and where\n 15569 14:51:27,762 --> 14:51:32,372 So 3 by 3 bricks, instead\nof one long column. 15570 14:51:32,972 --> 14:51:37,022 AUDIENCE: Why don't we create\na line and then we'll skip it. 15571 14:51:37,021 --> 14:51:39,971 DAVID J. MALAN: OK, so after\nprinting 3, we want to skip a line. 15572 14:51:39,972 --> 14:51:42,272 So maybe like print\nout a blank line here. 15573 14:51:43,262 --> 14:51:46,442 I like that instinct, right, print\n 15574 14:51:46,442 --> 14:51:48,781 Let's go ahead and run\nPython of Mario.py. 15575 14:51:48,781 --> 14:51:53,101 OK, it's more visible, what\nI'm doing, but still wrong. 15576 14:51:53,101 --> 14:51:55,631 What can I, what's the\nremaining fix, though? 15577 14:51:56,131 --> 14:51:59,311 AUDIENCE: So right behind the two. 15578 14:51:59,311 --> 14:52:02,201 DAVID J. MALAN: Yeah, I'm\ngetting an extra new line here 15579 14:52:02,201 --> 14:52:04,391 which I don't want\nwhile I'm on this row. 15580 14:52:04,391 --> 14:52:08,372 So let me do end equals quote unquote,\n 15581 14:52:10,472 --> 14:52:13,866 Python of Mario.py, voila, now\nwe've got it, in two dimensions. 15582 14:52:13,866 --> 14:52:15,241 And even this, we can tighten up. 15583 14:52:15,241 --> 14:52:17,741 Like, we could just use the\nlittle trick we learned. 15584 14:52:17,741 --> 14:52:21,752 So we could just say,\nprint a hash times 3 times 15585 14:52:21,752 --> 14:52:24,332 and we can get rid of one\nof those loops altogether.
15586 14:52:24,332 --> 14:52:27,452 All it's doing is, whoops, all it's\n 15587 14:52:27,451 --> 14:52:29,581 But, no, I don't want to do that. 15588 14:52:29,582 --> 14:52:31,353 What do I, how do I fix this here. 15589 14:52:31,353 --> 14:52:33,061 I don't think I want\nthis anymore, right? 15590 14:52:33,061 --> 14:52:34,871 Because that's giving\nme an extra new line. 15591 14:52:34,872 --> 14:52:37,782 So now this program is\nreally tightened up. 15592 14:52:37,781 --> 14:52:39,572 Same thing, two lines of code. 15593 14:52:39,572 --> 14:52:43,741 But we're now implementing this\n 15594 14:52:43,741 --> 14:52:46,961 All right, any questions here on these? 15595 14:52:47,461 --> 14:52:53,311 AUDIENCE: Is there any practical reason\n 15596 14:52:53,311 --> 14:52:56,371 the print function, you\ndon't put any spaces in it. 15597 14:52:56,372 --> 14:52:58,952 DAVID J. MALAN: If I\nprint n, any spaces. 15598 14:52:59,822 --> 14:53:01,961 AUDIENCE: Whenever we\nwrite n, for example 15599 14:53:01,961 --> 14:53:05,372 the print function\nis, you know, in order 15600 14:53:05,372 --> 14:53:10,342 to stop it from going to a new\nline, it seems like any spaces 15601 14:53:10,341 --> 14:53:14,322 we did like n equals and then too close. 15602 14:53:20,764 --> 14:53:24,552 So in a previous version, let me\n 15603 14:53:25,692 --> 14:53:28,242 The convention in Python\nis not to do that. 15604 14:53:28,872 --> 14:53:30,785 It just starts to add too much space. 15605 14:53:30,785 --> 14:53:32,952 And this is a little\ninconsistent, because, earlier 15606 14:53:32,951 --> 14:53:34,991 when we talked about\nlike pluses or spaces 15607 14:53:34,991 --> 14:53:37,271 around the less than or equal\nsigns, I did say add it. 15608 14:53:37,271 --> 14:53:39,531 Here it's actually\nclearer and recommended 15609 14:53:39,531 --> 14:53:40,781 to keep them tighter together. 15610 14:53:40,781 --> 14:53:44,082 Otherwise it just becomes harder\nto read where the gaps are. 
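The two grid versions discussed above can be sketched side by side. The function names and the `size` parameter are hypothetical (the lecture hard-codes 3), but both functions should print the same square of hashes.

```python
def print_grid_nested(size):
    """Nested loops: one print call per brick."""
    for i in range(size):
        for j in range(size):
            print("#", end="")   # stay on the current row
        print()                  # finish the row


def print_grid(size):
    """Tightened version: one print call per row."""
    for i in range(size):
        print("#" * size)        # default end="\n" finishes the row


print_grid(3)
```

On the spacing question: Python's PEP 8 style guide recommends no spaces around `=` for keyword arguments, hence `end=""` rather than `end = ""`.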
15611 14:53:45,341 --> 14:53:50,878 All right, let's do, how about,\nanother five minute break. 15612 14:53:51,461 --> 14:53:54,254 And then we're going to dive into\n 15613 14:53:54,254 --> 14:53:57,682 and then ultimately build with some\n 15614 14:53:59,652 --> 14:54:04,781 All right, so almost all\nof the examples we just did 15615 14:54:04,781 --> 14:54:07,061 were recreations of\nwhat we did in week 1. 15616 14:54:07,061 --> 14:54:09,641 And recall that week 1 was like\nour most syntax-heavy week. 15617 14:54:09,641 --> 14:54:13,451 It was when we were first learning\n 15618 14:54:13,451 --> 14:54:16,421 we began to focus a bit\nmore on ideas, like arrays 15619 14:54:16,421 --> 14:54:18,161 and other higher-level constructs. 15620 14:54:18,161 --> 14:54:21,401 And we'll do that again here, condensing\n 15621 14:54:21,402 --> 14:54:23,772 into a fewer set of examples in Python. 15622 14:54:23,771 --> 14:54:26,541 And we'll culminate by actually\ntaking Python out for a spin 15623 14:54:26,542 --> 14:54:28,822 and doing things that\nwould be way harder to do 15624 14:54:28,822 --> 14:54:33,351 and way more time-consuming to do in C,\n 15625 14:54:33,351 --> 14:54:36,311 But how do you go about figuring\nout what functions exist 15626 14:54:36,311 --> 14:54:39,491 if you didn't hear it in\nclass, you don't see it online 15627 14:54:39,491 --> 14:54:43,002 but you want to see it officially, you\n 15628 14:54:44,741 --> 14:54:47,862 And I will disclaim that, honestly,\n 15629 14:54:49,271 --> 14:54:51,761 Google will often be your\nfriend, so googling something 15630 14:54:51,762 --> 14:54:55,872 you're interested in, to find your way\n 15631 14:54:55,872 --> 14:54:58,932 or StackOverflow.com is\nanother popular website. 15632 14:54:58,932 --> 14:55:01,302 As always, though, the\nline should be googling 15633 14:55:01,302 --> 14:55:04,122 things like, how do I convert\na string to lowercase. 15634 14:55:04,122 --> 14:55:05,592 Like that's reasonable to Google. 
15635 14:55:05,591 --> 14:55:09,681 Or how to convert to uppercase or\n 15636 14:55:09,682 --> 14:55:14,472 But googling, of course, things like\n 15637 14:55:14,472 --> 14:55:15,641 of course, crosses the line. 15638 14:55:15,641 --> 14:55:18,599 But moving forward, and really with\n 15639 14:55:18,599 --> 14:55:20,741 and Stack Overflow are\nyour friends, but the line 15640 14:55:20,741 --> 14:55:23,061 is between the reasonable\nand the unreasonable. 15641 14:55:23,061 --> 14:55:26,411 So let me officially use the\nPython documentation search, just 15642 14:55:26,411 --> 14:55:29,051 to search for something\nlike the lowercase function. 15643 14:55:29,052 --> 14:55:31,062 Like, I know I can\nlowercase things in Python. 15644 14:55:32,502 --> 14:55:34,391 So let me just search\nfor the word lower. 15645 14:55:34,391 --> 14:55:37,332 You're going to get, often, an\noverwhelming number of results 15646 14:55:37,332 --> 14:55:40,200 because Python is a pretty big\n 15647 14:55:40,199 --> 14:55:42,491 And you're going to want to\nlook for familiar patterns. 15648 14:55:42,491 --> 14:55:45,581 For whatever reason,\nstring.lower, which is probably 15649 14:55:45,582 --> 14:55:48,942 more popular or more commonly used than\n 15650 14:55:48,942 --> 14:55:51,982 But it's purple, because I clicked\n 15651 14:55:51,982 --> 14:55:54,972 So str.lower is probably\nwhat I want, because I 15652 14:55:54,972 --> 14:55:57,582 am interested at the moment\nin lower casing strings. 15653 14:55:57,582 --> 14:56:01,779 When I click on that, this is an example\n 15654 14:56:02,322 --> 14:56:03,862 It's in this general format. 15655 14:56:03,862 --> 14:56:05,862 Here's my str.lower function. 15656 14:56:05,862 --> 14:56:08,061 This returns a copy of\nthe string, with all 15657 14:56:08,061 --> 14:56:10,271 of the cased characters\nconverted to lowercase 15658 14:56:10,271 --> 14:56:12,191 and the lower-casing\nalgorithm, dot dot dot. 
15659 14:56:12,192 --> 14:56:13,690 So that doesn't give me much. 15660 14:56:13,690 --> 14:56:14,982 It doesn't give me sample code. 15661 14:56:14,982 --> 14:56:16,732 But it does say what the function does. 15662 14:56:16,732 --> 14:56:20,412 And if we keep looking, you'll see\n 15663 14:56:20,411 --> 14:56:24,641 I used its analog, Rstrip before, right\n 15664 14:56:24,641 --> 14:56:27,521 that is strip, from the end of a\n 15665 14:56:27,521 --> 14:56:29,451 like a new line, or even something else. 15666 14:56:29,451 --> 14:56:32,932 And if you scroll through\nstring, this web page here. 15667 14:56:32,932 --> 14:56:34,631 And we're halfway down the page already. 15668 14:56:34,631 --> 14:56:36,701 If you see my scroll\nbar, tiny on the right 15669 14:56:36,701 --> 14:56:41,771 there's a huge amount of functionality\n 15670 14:56:41,771 --> 14:56:44,981 And this is just testament to just\n 15671 14:56:44,982 --> 14:56:49,142 But it's also reason to\nreassure that the goal, when 15672 14:56:49,141 --> 14:56:51,391 playing around with some new\nlanguage and learning it 15673 14:56:51,391 --> 14:56:53,120 is not to learn it exhaustively. 15674 14:56:53,120 --> 14:56:54,912 Just like in English\nor any human language 15675 14:56:54,911 --> 14:56:57,161 there's always going to be\nvocab words you don't know 15676 14:56:57,161 --> 14:57:00,084 ways of presenting the same\ninformation in some language. 15677 14:57:00,084 --> 14:57:01,752 That's going to be the case with Python. 15678 14:57:01,752 --> 14:57:05,141 And what we'll do today and this\nweek in problem set 6 is really 15679 14:57:05,141 --> 14:57:06,641 get your footing with this language. 
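To make the documentation entries above concrete, here is a brief sketch of `str.lower` and `str.rstrip` applied to a sample string chosen for illustration. Both methods return a new string and leave the original untouched.

```python
s = "Hello, World\n"

lowered = s.lower()     # copy with cased characters converted to lowercase
stripped = s.rstrip()   # copy with trailing whitespace (here "\n") removed

print(repr(lowered))
print(repr(stripped))
print(repr(s))          # the original string is unchanged
```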
15680 14:57:06,641 --> 14:57:09,822 But you won't know all of Python,\n 15681 14:57:09,822 --> 14:57:12,822 And, honestly, you won't know all of\n 15682 14:57:12,822 --> 14:57:15,322 unless you're, perhaps, using\nthem full time professionally 15683 14:57:15,322 --> 14:57:18,891 and even then, there's more libraries\n 15684 14:57:18,891 --> 14:57:21,942 So let's actually now\npivot to a few other ideas 15685 14:57:21,942 --> 14:57:24,082 that we'll implement\nin Python, in a moment. 15686 14:57:24,082 --> 14:57:26,531 Let me switch back over to VS Code here. 15687 14:57:26,531 --> 14:57:31,781 And let me whip up, say, a recreation\n 15688 14:57:31,781 --> 14:57:34,404 where we averaged like\nthree scores together. 15689 14:57:34,404 --> 14:57:36,822 And that was an opportunity\nin week 2 to play with arrays 15690 14:57:36,822 --> 14:57:38,951 to realize how constrained arrays are. 15691 14:57:40,241 --> 14:57:41,561 You have to decide in advance. 15692 14:57:41,561 --> 14:57:43,631 But let's see what's\ndifferent here in Python. 15693 14:57:43,631 --> 14:57:48,101 So let me do Scores.py, and let\n 15694 14:57:48,101 --> 14:57:52,301 called scores, sorry, let me give myself\n 15695 14:57:52,302 --> 14:57:54,462 Set it equal to a list\nof three scores, which 15696 14:57:54,461 --> 14:57:59,082 are the same ones we've used\nbefore, 72, 73, 33, in this context 15697 14:57:59,082 --> 14:58:01,152 meant to be scores, not ASCII values. 15698 14:58:01,152 --> 14:58:03,042 And then let's just do\nthe average of these. 15699 14:58:03,042 --> 14:58:05,152 So average will be another variable. 15700 14:58:05,152 --> 14:58:09,432 And it turns out I can do, well,\nhow did I sum these before? 15701 14:58:09,432 --> 14:58:13,101 I probably had a for loop to add\n 15702 14:58:13,101 --> 14:58:16,101 Turns out in Python, you\ncan just say sum of scores 15703 14:58:16,101 --> 14:58:18,051 divided by the length of scores. 
15704 14:58:18,052 --> 14:58:19,652 That's going to give me my average. 15705 14:58:19,652 --> 14:58:22,732 So sum is a function that takes\na list, in this case, as input 15706 14:58:22,732 --> 14:58:25,522 and it just does the sum for\nyou, with a for loop or whatever 15707 14:58:26,451 --> 14:58:30,002 Len gives you the length of the\nlist, how many things are in it. 15708 14:58:30,002 --> 14:58:31,762 So I can dynamically figure that out. 15709 14:58:31,762 --> 14:58:36,862 Now let me go ahead and print out,\n 15710 14:58:36,862 --> 14:58:40,150 in curly braces, the actual\naverage, close quote. 15711 14:58:40,150 --> 14:58:42,442 All right, so let's run this\ncode, Python of Scores.py. 15712 14:58:42,442 --> 14:58:47,572 And there is my average, in this\ncase, 59.33333 and so forth 15713 14:58:48,832 --> 14:58:51,022 Well, let's actually, now,\nchange this a little bit 15714 14:58:51,021 --> 14:58:54,146 and make it a little more interesting,\n 15715 14:58:54,146 --> 14:58:55,711 rather than hard coding this. 15716 14:58:55,711 --> 14:58:59,089 Let me go back up here and\nuse from CS50 import getInt 15717 14:58:59,089 --> 14:59:01,881 because I don't want to deal with\n 15718 14:59:01,881 --> 14:59:04,341 Like, I just want to use\nsomeone else's function here. 15719 14:59:04,341 --> 14:59:08,121 Let me give myself an\nempty list called scores. 15720 14:59:08,122 --> 14:59:11,002 And this is not something we\nwere able to do in C, right? 15721 14:59:11,002 --> 14:59:13,131 Because in C, if you tried\nto make an empty array 15722 14:59:13,131 --> 14:59:16,112 well, that's pretty stupid,\nbecause you can't add things to it. 15723 14:59:17,432 --> 14:59:19,171 So it wouldn't even let you do that. 15724 14:59:19,171 --> 14:59:22,161 But I can just create\nan empty list in Python 15725 14:59:22,161 --> 14:59:24,861 because lists, unlike arrays,\nare really lengthless. 
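The hard-coded averaging program described above comes to roughly three lines. This sketch uses the same three scores from the lecture; the exact label in the f-string is an assumption.

```python
scores = [72, 73, 33]

# sum() adds the elements; len() counts them
average = sum(scores) / len(scores)

print(f"Average: {average}")
```

Note that `/` always produces a float in Python 3, so no cast is needed to avoid the integer truncation seen in C.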
15726 14:59:26,271 --> 14:59:29,391 But you and I are not dealing with\n 15727 14:59:31,292 --> 14:59:34,957 So now, let's go ahead and get a\n 15728 14:59:34,957 --> 14:59:36,332 How about three of them in total. 15729 14:59:36,332 --> 14:59:41,872 So for i in range of 3, let's go\n 15730 14:59:41,872 --> 14:59:44,332 using getInt, asking them for score. 15731 14:59:44,332 --> 14:59:51,362 And then let's go ahead and append, to\n 15732 14:59:51,362 --> 14:59:53,722 So it turns out that a list,\nand I could read the Python 15733 14:59:53,722 --> 14:59:57,802 documentation to confirm as much,\n 15734 14:59:57,802 --> 15:00:01,677 and functions built into objects\nare generally known as methods 15735 15:00:01,677 --> 15:00:03,052 if you've heard that term before. 15736 15:00:03,052 --> 15:00:05,842 Same idea, but whereas a function\nkind of stands on its own 15737 15:00:05,841 --> 15:00:09,951 a method is a function built\ninto an object, like a list here. 15738 15:00:09,951 --> 15:00:12,438 That's going to achieve the\nsame result. Strictly speaking 15739 15:00:13,521 --> 15:00:17,125 Just like in C, I could tighten this\n 15740 15:00:17,125 --> 15:00:19,042 But, I don't know, I\nkind of like it this way. 15741 15:00:19,042 --> 15:00:22,491 It's more clear, to me, at least, that\n 15742 15:00:22,491 --> 15:00:24,360 and then appending it to the list. 15743 15:00:24,360 --> 15:00:26,152 Now the rest of the\ncode can stay the same. 15744 15:00:26,152 --> 15:00:31,222 Python of Scores.py,\nscore will be 72, 73, 33. 15745 15:00:32,341 --> 15:00:35,361 But now the program's a little\nmore dynamic, which is nice. 15746 15:00:35,362 --> 15:00:37,461 But there's other\nsyntax I could use here. 15747 15:00:37,461 --> 15:00:40,851 Just so you've seen it, Python does\n 15748 15:00:40,851 --> 15:00:43,371 whereby, if you don't\nwant to do scores.append 15749 15:00:43,372 --> 15:00:47,812 you can actually say scores\nplus equals this score. 
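A sketch of growing a list from empty, as described above. The values in the tuple are stand-ins for what the user would type at a prompt (an assumption so the snippet runs without input), and the second list, `more`, is a hypothetical name used to compare the two spellings.

```python
scores = []                       # an empty list; it grows on demand
for value in (72, 73, 33):
    scores.append(value)          # append is a method built into list

more = []
for value in (72, 73, 33):
    more += [value]               # += expects a list on the right-hand side

print(scores, more)
```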
15750 15:00:47,811 --> 15:00:52,252 So you can actually concatenate\nlists together in Python, too. 15751 15:00:52,252 --> 15:00:54,862 Just as we used plus to\njoin two strings together 15752 15:00:54,862 --> 15:00:57,921 you can use plus to\njoin two lists together. 15753 15:00:57,921 --> 15:01:00,561 The catch is, you need\nto put the one score I'm 15754 15:01:00,561 --> 15:01:03,292 adding here in a list of its\nown, which is kind of silly. 15755 15:01:03,292 --> 15:01:07,851 But it's necessary, so that this\n 15756 15:01:07,851 --> 15:01:10,491 To do this more verbosely,\nwhich most programmers wouldn't 15757 15:01:10,491 --> 15:01:12,831 do, but just for clarity,\nthis is the same thing 15758 15:01:12,832 --> 15:01:15,472 as saying scores plus this score. 15759 15:01:15,472 --> 15:01:19,432 So now maybe it's a little more\n 15760 15:01:19,432 --> 15:01:24,201 plural, sorry, singular, are both\n 15761 15:01:25,381 --> 15:01:28,261 So two different ways, not sure\none is better than the other. 15762 15:01:28,262 --> 15:01:34,162 This way is pretty common, but .append\n 15763 15:01:34,161 --> 15:01:36,861 All right, how about another\nexample from week two. 15764 15:01:36,862 --> 15:01:39,592 This one was called uppercase. 15765 15:01:39,591 --> 15:01:42,841 So let me do this in\nUppercase.py, though, this time. 15766 15:01:42,841 --> 15:01:46,701 And let me import from\nCS50, get string again. 15767 15:01:46,701 --> 15:01:50,541 And let me go ahead and say,\nbefore will be my first variable. 15768 15:01:50,542 --> 15:01:54,022 Let me get a string from the user,\n 15769 15:01:54,021 --> 15:01:59,182 And then let me go ahead and say,\n 15770 15:01:59,182 --> 15:02:01,711 upper-casing to this string. 15771 15:02:01,711 --> 15:02:04,372 Let me change my line ending to\nbe that, using our new trick.
15772 15:02:04,372 --> 15:02:08,012 And this is where things get cool\n 15773 15:02:08,012 --> 15:02:11,572 If I want to iterate over all\nof the characters in a string 15774 15:02:11,572 --> 15:02:14,661 and print them out in uppercase,\n 15775 15:02:14,661 --> 15:02:22,553 For c in the before string, go ahead and\n 15776 15:02:22,553 --> 15:02:25,761 but don't end the line yet, because I\n 15777 15:02:28,012 --> 15:02:31,492 Python of Uppercase.py, let me\ntype in Hello in all lowercase. 15778 15:02:31,491 --> 15:02:33,531 I've just upper-cased the whole string. 15779 15:02:34,222 --> 15:02:36,652 I first get string, calling it before. 15780 15:02:36,652 --> 15:02:39,202 I then just print out some fluffy\ntext that says after colon 15781 15:02:39,201 --> 15:02:41,362 and I get rid of the line ending,\n 15782 15:02:41,362 --> 15:02:43,154 Notice I hit the spacebar\na couple of times 15783 15:02:43,154 --> 15:02:45,141 just so letters line up to be pretty. 15784 15:02:45,141 --> 15:02:47,302 For c in before, this is new. 15785 15:02:47,302 --> 15:02:51,022 This is powerful in C,\nsorry, in Python, whereby 15786 15:02:51,021 --> 15:02:54,111 you don't have to do like int i\nequals 0 and i less than this 15787 15:02:54,112 --> 15:02:58,832 you could just say, for c in the\n 15788 15:02:58,832 --> 15:03:02,031 And then here is just upper-casing\nthat specific character 15789 15:03:02,031 --> 15:03:04,222 and making sure we don't\noutput a new line too soon. 15790 15:03:04,222 --> 15:03:06,442 But this is actually more\nwork than I need to do. 15791 15:03:06,442 --> 15:03:10,522 Based on what we've seen thus far,\n 15792 15:03:10,521 --> 15:03:12,141 can I tighten this up further? 15793 15:03:12,141 --> 15:03:16,862 Can I collapse lines 5 and 6,\nmaybe even 7, all together? 15794 15:03:16,862 --> 15:03:23,072 If the goal of this program is just\n 15795 15:03:27,002 --> 15:03:28,809 AUDIENCE: Would it be str.upper? 15796 15:03:28,809 --> 15:03:31,141 DAVID J.
MALAN: Str.upper,\nyeah, so I could do something 15797 15:03:31,141 --> 15:03:34,021 like this, after gets before.upper. 15798 15:03:34,021 --> 15:03:36,271 So it's not str\nliterally dot upper, str 15799 15:03:36,271 --> 15:03:38,021 just represents the string in question. 15800 15:03:38,021 --> 15:03:41,141 So it would be before.upper,\nbut right idea otherwise. 15801 15:03:41,141 --> 15:03:44,652 And so let me go ahead and just tweak\n 15802 15:03:44,652 --> 15:03:49,332 Let me just go ahead and print out the\n 15803 15:03:49,332 --> 15:03:51,961 So this line is the same, I'm\ngetting a string called before. 15804 15:03:51,961 --> 15:03:55,051 I'm creating another variable\ncalled after, and, as you propose 15805 15:03:55,052 --> 15:03:58,482 I'm calling upper on the whole\n 15806 15:03:59,881 --> 15:04:03,871 And, again, in Python, there aren't\n 15807 15:04:03,872 --> 15:04:05,282 There's only strings, anyway. 15808 15:04:05,281 --> 15:04:07,121 So I might as well do them all at once. 15809 15:04:07,122 --> 15:04:10,742 So if I rerun the code now,\nPython of Uppercase.py. 15810 15:04:10,741 --> 15:04:15,601 Now I'll type in Hello in all\nlowercase, and, oh, so close 15811 15:04:15,601 --> 15:04:18,631 I think I can get rid of\nthis override, because I'm 15812 15:04:18,631 --> 15:04:22,031 printing the whole thing out at\n 15813 15:04:22,031 --> 15:04:26,402 So now if I type in Hello before,\n 15814 15:04:28,601 --> 15:04:32,432 All right, any questions,\nthen, on lists or on strings 15815 15:04:32,432 --> 15:04:37,762 and what this kind of function,\n 15816 15:04:38,262 --> 15:04:41,281 All right, so a couple other\nbuilding blocks before we start. 15817 15:04:44,531 --> 15:04:46,572 DAVID J. MALAN: To the right, right. 15818 15:04:47,561 --> 15:04:53,723 AUDIENCE: Could you write, very close to\n 15819 15:04:53,724 --> 15:04:55,779 you start creating a variable upper. 15820 15:04:55,779 --> 15:04:58,362 DAVID J. 
MALAN: Yes, do I have\nto create this variable, upper? 15821 15:04:59,112 --> 15:05:01,391 I could actually tighten\nthis up, and, if you really 15822 15:05:01,391 --> 15:05:04,692 want to see something neat,\ninside of the curly braces 15823 15:05:04,692 --> 15:05:07,572 you don't have to just put\nthe names of variables. 15824 15:05:07,572 --> 15:05:10,122 You can put a small\namount of logic, so long 15825 15:05:10,122 --> 15:05:13,302 as it doesn't start to look stupid and\n 15826 15:05:13,302 --> 15:05:15,462 that it's sort of bad\ndesign at that point. 15827 15:05:15,461 --> 15:05:17,061 I can tighten this up like this. 15828 15:05:17,061 --> 15:05:21,131 And now we're in Python of\nUppercase.py, writing Hello again. 15829 15:05:22,252 --> 15:05:23,802 But I would be careful about this. 15830 15:05:23,802 --> 15:05:27,005 You want to resist the temptation of\n 15831 15:05:27,004 --> 15:05:29,921 inside the curly braces, because\n 15832 15:05:29,921 --> 15:05:32,411 But, absolutely, you\ncould indeed do that, too. 15833 15:05:32,411 --> 15:05:35,471 All right, how about command line\narguments, which was one thing 15834 15:05:35,472 --> 15:05:39,552 we introduced in week two also, so\n 15835 15:05:39,552 --> 15:05:43,272 to take input from the user, whoops. 15836 15:05:43,271 --> 15:05:46,791 So we could actually take input\n 15837 15:05:46,792 --> 15:05:49,732 so as to take literally\ncommand line arguments. 15838 15:05:49,732 --> 15:05:52,542 These are a little different,\nbut it follows the same paradigm. 15839 15:05:52,542 --> 15:05:56,381 There's no main by default.\nAnd there's no def main, int 15840 15:05:56,381 --> 15:06:02,572 argc, char, or we called it string,\n 15841 15:06:02,572 --> 15:06:07,031 So if you want access to the\n 15842 15:06:07,031 --> 15:06:11,621 And it turns out, there's another\n 15843 15:06:11,622 --> 15:06:15,702 called sys, and you can import from\n 15844 15:06:15,701 --> 15:06:17,879 So same idea, different place. 
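Looking back at the uppercase example, here is a rough sketch of the three versions the lecture walks through. It assumes a hard-coded string in place of CS50's get_string so that it runs without the CS50 library; the exact prompt and label text are also assumptions.

```python
before = "hello"  # stands in for get_string("Before: "), which needs the CS50 library

# Version 1: loop over the characters, upper-casing one at a time
# and suppressing print's newline until the loop is done.
print("After:  ", end="")
for c in before:
    print(c.upper(), end="")
print()

# Version 2: strings have an upper method, so call it on the whole string.
after = before.upper()
print(f"After:  {after}")

# Version 3: a small expression inside the f-string's curly braces,
# which removes the need for the temporary variable.
print(f"After:  {before.upper()}")
```

All three print the same line. As the lecture cautions, version 3 is fine for a single method call, but anything much longer inside the braces gets hard to read.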
15845 15:06:17,879 --> 15:06:19,461 Now I'm going to go ahead and do this. 15846 15:06:19,461 --> 15:06:24,341 Let's write a program that just requires\n 15847 15:06:24,341 --> 15:06:26,572 after the program's\nname, or none at all. 15848 15:06:26,572 --> 15:06:33,192 So if the length of argv equals 2,\n 15849 15:06:33,192 --> 15:06:41,610 Hello comma argv bracket 1 close quote,\n 15850 15:06:41,610 --> 15:06:44,652 total at the prompt, let's just say\n 15851 15:06:45,682 --> 15:06:48,701 So the only thing that's new here\n 15852 15:06:48,701 --> 15:06:51,971 and we're using this fancy f-string\n 15853 15:06:51,972 --> 15:06:55,031 too, it's putting more complex\nlogic in the curly braces. 15854 15:06:55,792 --> 15:07:00,412 In this case, it's a list called argv,\n 15855 15:07:00,411 --> 15:07:04,301 Let's do Python of Argv.py,\nEnter, Hello, world. 15856 15:07:04,302 --> 15:07:08,002 What if I do Argv.py\nDavid at the command line. 15857 15:07:09,252 --> 15:07:11,201 So there's one curiosity here. 15858 15:07:11,201 --> 15:07:15,896 Python is not included in\nargv, whereas in C, dot 15859 15:07:15,896 --> 15:07:18,461 slash whatever was the first thing. 15860 15:07:18,461 --> 15:07:22,031 If the analog in Python is that\nthe name of your Python program 15861 15:07:22,031 --> 15:07:26,322 is the first thing, in bracket 0,\n 15862 15:07:26,322 --> 15:07:32,262 the word Python does not appear in\n 15863 15:07:32,262 --> 15:07:34,512 But otherwise, the\nidea of these arguments 15864 15:07:34,512 --> 15:07:36,904 is exactly the same as before. 15865 15:07:36,904 --> 15:07:39,072 And in fact, what you can\ndo, which is kind of cool 15866 15:07:39,072 --> 15:07:42,252 is, because argv is a list,\nyou can do things like this. 15867 15:07:42,252 --> 15:07:47,412 For arg in argv, go ahead\nand print out each argument. 
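The argv program described above might look roughly like this. The function name greeting is a hypothetical helper introduced here so the logic can be tried with different argument lists; the lecture writes the if/else directly at the top level of the file.

```python
import sys


def greeting(argv):
    """Return the greeting for a given argument list (hypothetical helper)."""
    # argv[0] is the program's name, e.g. "argv.py"; the word "python" is not included.
    if len(argv) == 2:
        return f"hello, {argv[1]}"
    return "hello, world"


print(greeting(sys.argv))

# Because argv is a list, it can be iterated over directly, with no index variable.
for arg in sys.argv:
    print(arg)
```

Run as `python argv.py David`, sys.argv would be a list of two strings, the program name and David.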
15868 15:07:47,411 --> 15:07:49,511 So instead of using a\nfor loop and i and all 15869 15:07:49,512 --> 15:07:53,742 of this, if I do Python of argv Enter,\n 15870 15:07:53,741 --> 15:07:58,481 If I do Python of argv Foo,\nit puts Argv.py and Foo. 15871 15:07:58,482 --> 15:08:03,042 If I do, sorry, if I do Foo and\nbar, those words all print out. 15872 15:08:03,042 --> 15:08:05,292 If I do Foobar baz, those print out too. 15873 15:08:05,292 --> 15:08:08,351 And Foo and bar or baz are like\na mathematician's x and y and z 15874 15:08:08,351 --> 15:08:11,722 for computer scientists, when you\n 15875 15:08:12,942 --> 15:08:16,542 It reads a little more like English, and\n 15876 15:08:16,542 --> 15:08:20,052 allows you to iterate very quickly\n 15877 15:08:20,052 --> 15:08:22,692 Suppose I only wanted the real\nwords that the human typed 15878 15:08:23,771 --> 15:08:26,981 Like, suppose I want to ignore Argv.py. 15879 15:08:26,982 --> 15:08:30,162 I mean I could do something\nhackish like this. 15880 15:08:30,161 --> 15:08:35,626 If arg equals Argv.py,\nI could just ignore 15881 15:08:35,627 --> 15:08:37,002 you know, let's invert the logic. 15882 15:08:37,002 --> 15:08:39,052 I could do this, for instance. 15883 15:08:39,052 --> 15:08:41,622 So if the arg does not\nequal the program name 15884 15:08:41,622 --> 15:08:44,412 then go ahead and print out the word. 15885 15:08:44,411 --> 15:08:46,361 So I get Foobar and baz only. 15886 15:08:46,362 --> 15:08:50,921 Or, this is what's kind of neat\n 15887 15:08:50,921 --> 15:08:54,921 And let me just take a slice of\nthe array of the list instead. 15888 15:08:54,921 --> 15:08:59,332 So it turns out, if argv is\na list, I can actually say 15889 15:08:59,332 --> 15:09:03,582 you know what, go into that list,\n 15890 15:09:03,582 --> 15:09:05,722 and then go all the way to the end. 15891 15:09:05,722 --> 15:09:08,322 And we have not seen this\nsyntax in C. 
But this 15892 15:09:08,322 --> 15:09:10,932 is a way of slicing a list in Python. 15893 15:09:12,341 --> 15:09:17,381 If I run Python of\nArgv.py, Foo bar baz Enter 15894 15:09:17,381 --> 15:09:21,252 I get only a subset of the\nlist, starting at position 1 15895 15:09:21,252 --> 15:09:23,413 going all of the way to the end. 15896 15:09:23,413 --> 15:09:25,121 And you can even do\nkind of the opposite. 15897 15:09:25,122 --> 15:09:27,852 If, for whatever reason, you\nwant to ignore the last element 15898 15:09:27,851 --> 15:09:33,551 you can say colon, we\ncould say colon negative 1 15899 15:09:33,552 --> 15:09:36,082 and use a negative number,\nwhich we've not seen before 15900 15:09:36,082 --> 15:09:38,992 which slices off the end\nof the list, as well. 15901 15:09:38,991 --> 15:09:42,521 So there's some syntactic tricks\n 15902 15:09:42,521 --> 15:09:46,661 even if at first glance, you might\n 15903 15:09:46,661 --> 15:09:49,319 All right, let's do one\nother example with exit 15904 15:09:49,319 --> 15:09:51,612 and then we'll start actually\napplying some algorithms 15905 15:09:51,612 --> 15:09:52,737 to make things interesting. 15906 15:09:52,737 --> 15:09:56,991 So in one last program here, let's do\n 15907 15:09:56,991 --> 15:09:58,731 before we introduce some algorithms. 15908 15:10:00,741 --> 15:10:05,421 Let's import, from sys, argv. 15909 15:10:07,012 --> 15:10:09,722 Let's make sure the user gives\nme one command line argument. 15910 15:10:09,722 --> 15:10:16,101 So if the length of argv does not\n 15911 15:10:16,101 --> 15:10:19,311 and print out something like\nmissing command line argument 15912 15:10:19,311 --> 15:10:21,112 just to explain what the problem is. 15913 15:10:25,101 --> 15:10:27,231 But I'm going to use a\nbetter version of exit here. 15914 15:10:27,232 --> 15:10:29,422 Let me import two functions from sys. 
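A short sketch of the slicing syntax described above. The list is hard-coded to what sys.argv would hold after running `python argv.py foo bar baz`, so the example does not depend on how it is launched.

```python
# What sys.argv would contain for: python argv.py foo bar baz
argv = ["argv.py", "foo", "bar", "baz"]

# Start at position 1 and go all the way to the end: skips the program name.
for arg in argv[1:]:
    print(arg)

# A negative number counts from the end: everything except the last element.
all_but_last = argv[:-1]
print(all_but_last)
```

A slice produces a new list, so argv itself is left unchanged by either expression.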
15915 15:10:29,421 --> 15:10:33,561 Turns out the better way to do this is\n 15916 15:10:33,561 --> 15:10:36,514 specifically 2, with this exit code. 15917 15:10:36,514 --> 15:10:38,932 Otherwise, down here, I'm going\nto go ahead and print out 15918 15:10:38,932 --> 15:10:43,340 something like Hello, comma\nargv bracket 1, same as before. 15919 15:10:43,339 --> 15:10:44,881 And then I'm going to exit with zero. 15920 15:10:44,881 --> 15:10:46,932 So, again, this was a\nsubtle thing we introduced 15921 15:10:46,932 --> 15:10:49,432 in week two, where you can\nactually have your programs exit 15922 15:10:49,432 --> 15:10:51,951 with some number, where\n0 signifies success 15923 15:10:51,951 --> 15:10:53,871 and anything else signifies error. 15924 15:10:53,872 --> 15:10:55,762 This is just the same idea in Python. 15925 15:10:55,762 --> 15:11:00,442 So if I, for instance, just run the\n 15926 15:11:00,442 --> 15:11:03,141 I meant to say exit here and exit here. 15927 15:11:04,232 --> 15:11:07,022 If I run this like this, I'm\nmissing a command line argument. 15928 15:11:07,021 --> 15:11:09,721 So let me rerun it with\nlike my name at the prompt. 15929 15:11:09,722 --> 15:11:13,552 So I have exactly two command line\n 15930 15:11:14,572 --> 15:11:16,864 And if I do David Malan, it's\nnot going to work either 15931 15:11:16,864 --> 15:11:18,682 because now argv does not equal 2. 15932 15:11:18,682 --> 15:11:21,381 But the difference here is\nthat we're exiting with 1 15933 15:11:21,381 --> 15:11:26,421 so that special programs can detect an\n 15934 15:11:26,421 --> 15:11:28,701 And now there's one other\nway to do this, too. 15935 15:11:28,701 --> 15:11:30,981 Suppose that you're\nimporting a lot of functions 15936 15:11:30,982 --> 15:11:33,465 and you don't really want\nto make a mess of things 15937 15:11:33,464 --> 15:11:35,631 and just have all of these\nfunction names available 15938 15:11:35,631 --> 15:11:38,151 without it being clear\nwhere they came from. 
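A sketch of the exit-status program, written with `import sys` and the sys. prefix rather than `from sys import argv, exit`. Wrapping the logic in a function named run is an assumption made here so it can be called with different argument lists; the message text is approximate.

```python
import sys


def run(argv):
    """Exit with status 1 on a usage error, 0 on success (hypothetical wrapper)."""
    if len(argv) != 2:
        print("Missing command-line argument")
        sys.exit(1)  # any nonzero status signals an error
    print(f"hello, {argv[1]}")
    sys.exit(0)      # zero signals success


# In a real script the last line would be:
# run(sys.argv)
```

sys.exit works by raising SystemExit, which is how the interpreter ends the program with the given status.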
15939 15:11:38,152 --> 15:11:39,982 Let's just import all of sys. 15940 15:11:39,982 --> 15:11:43,702 And let's just change our syntax,\n 15941 15:11:43,701 --> 15:11:46,491 where we just prepend to all\nof these library functions 15942 15:11:46,491 --> 15:11:49,941 sys, just to be super-explicit\nwhere they came from 15943 15:11:49,942 --> 15:11:55,359 and if there's another\nexit or argv value 15944 15:11:55,358 --> 15:11:58,441 that we want to import from a library,\n 15945 15:11:58,442 --> 15:12:01,672 So if I do it one last time here,\nmissing command line argument. 15946 15:12:01,671 --> 15:12:03,711 But David still actually worked. 15947 15:12:03,711 --> 15:12:06,771 All right, only to demonstrate how\n 15948 15:12:06,771 --> 15:12:09,651 Let's now do something more\npowerful, like a search algorithm 15949 15:12:10,553 --> 15:12:13,011 I'm going to go ahead and open\nup a file called Numbers.py 15950 15:12:13,012 --> 15:12:16,942 and let's just do some searching\nor linear search, rather 15951 15:12:20,582 --> 15:12:23,572 How about import sys as before. 15952 15:12:23,572 --> 15:12:29,362 Let me give myself a list of\nnumbers, like 4, 6, 8, 2, 7, 5, 0 15953 15:12:29,362 --> 15:12:31,192 so just a bunch of integers. 15954 15:12:32,692 --> 15:12:36,112 If you recall from week three,\nwe searched for the number 0 15955 15:12:36,112 --> 15:12:38,402 at the end of the lockers on stage. 15956 15:12:38,402 --> 15:12:40,641 So let's just ask that\nquestion in Python. 15957 15:12:40,641 --> 15:12:42,381 No need for a loop or\nanything like that. 15958 15:12:42,381 --> 15:12:46,072 If 0 is in the numbers, go\nahead and print out found. 15959 15:12:46,072 --> 15:12:49,942 And then let's just exit successfully,\n 15960 15:12:49,942 --> 15:12:52,192 let's just say print not found. 15961 15:12:52,192 --> 15:12:55,732 And then we'll sys.exit with 1. 15962 15:12:55,732 --> 15:12:58,342 So this is where Python\nstarts to get powerful again. 
15963 15:12:59,572 --> 15:13:02,254 Here is your loop, that's doing\nall of the checking for you. 15964 15:13:02,254 --> 15:13:04,671 Underneath the hood, Python\nis going to use linear search. 15965 15:13:04,671 --> 15:13:06,338 You don't have to implement it yourself. 15966 15:13:06,338 --> 15:13:08,841 No while loop, no for loop,\nyou just ask a question. 15967 15:13:08,841 --> 15:13:12,752 If 0 is in numbers,\nthen do the following. 15968 15:13:12,752 --> 15:13:14,872 So that's one feature\nwe now get with Python 15969 15:13:14,872 --> 15:13:16,862 and get to throw away\na lot of that code. 15970 15:13:16,862 --> 15:13:18,351 We can do it with strings, too. 15971 15:13:18,351 --> 15:13:21,362 Let me open a file\ncalled Names.py instead 15972 15:13:21,362 --> 15:13:23,512 and do something that was\neven more involved in C 15973 15:13:23,512 --> 15:13:26,542 because we needed strcmp and\nthe for loop, and so forth. 15974 15:13:26,542 --> 15:13:28,522 Let me import sys for this file. 15975 15:13:28,521 --> 15:13:30,981 Let's give myself a bunch\nof names like we did in C. 15976 15:13:30,982 --> 15:13:38,152 And those were Bill and Charlie\nand Fred and George and Ginny 15977 15:13:38,152 --> 15:13:41,961 and two more, Percy, and lastly Ron. 15978 15:13:41,961 --> 15:13:43,911 And recall, at the\ntime, we looked for Ron. 15979 15:13:43,911 --> 15:13:45,953 And so we had to iterate\nthrough the whole thing 15980 15:13:45,953 --> 15:13:48,331 doing strcmp and i plus\nplus and all of that. 15981 15:13:48,332 --> 15:13:55,281 Now just ask the question, if Ron\n 15982 15:13:55,281 --> 15:13:56,961 and, whoops, let me hide that. 15983 15:13:58,771 --> 15:14:02,701 Let me go ahead and say\nprint, found, as before. 15984 15:14:02,701 --> 15:14:06,231 sys.exit 0, just to indicate\nsuccess, and then down here 15985 15:14:06,232 --> 15:14:09,362 if we get to this point,\nwe can say not found. 15986 15:14:09,362 --> 15:14:12,692 And then we'll just sys.exit 1 instead. 
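The numbers and names searches above come down to the in operator. In this sketch the helper search is a hypothetical name; it returns the status the lecture passes to sys.exit (0 for found, 1 for not found) instead of exiting, so both searches can run in one file.

```python
numbers = [4, 6, 8, 2, 7, 5, 0]
names = ["Bill", "Charlie", "Fred", "George", "Ginny", "Percy", "Ron"]


def search(haystack, needle):
    """Print the outcome and return the exit status (hypothetical helper)."""
    # "in" on a list checks the elements one after another: a linear search.
    if needle in haystack:
        print("Found")
        return 0
    print("Not found")
    return 1


number_status = search(numbers, 0)
name_status = search(names, "Ron")
```

The same operator works for integers and strings alike, with no strcmp needed, because == on Python strings compares their contents.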
15987 15:14:12,692 --> 15:14:17,482 So, again, this just does linear search\n 15988 15:14:17,482 --> 15:14:20,932 we found Ron, because, indeed, he's\n 15989 15:14:20,932 --> 15:14:24,711 But we don't need to deal with\nall of the mechanics of it. 15990 15:14:24,711 --> 15:14:27,051 All right, let's take\nthings one step further. 15991 15:14:27,052 --> 15:14:29,362 In week three, we also\nimplemented the idea 15992 15:14:29,362 --> 15:14:33,502 of a phone book, that actually\nassociated keys with values. 15993 15:14:33,502 --> 15:14:36,531 But remember, the phone book in\nC, was kind of a hack, right? 15994 15:14:36,531 --> 15:14:40,042 Because we first had two arrays,\n 15995 15:14:40,042 --> 15:14:43,851 Then we introduced structs, and\n 15996 15:14:43,851 --> 15:14:47,421 And then we had an array of persons. 15997 15:14:47,421 --> 15:14:51,561 You can do this in Python, using\n 15998 15:14:51,561 --> 15:14:54,191 But we can also just use a\ngeneral purpose dictionary 15999 15:14:54,192 --> 15:14:57,942 because just like in P set 5, you\n 16000 15:14:59,622 --> 15:15:02,922 Well, similarly, can\nPython just do this for us. 16001 15:15:02,921 --> 15:15:05,771 From CS50, let's import get string. 16002 15:15:05,771 --> 15:15:09,281 And now let's give myself\na dictionary of people 16003 15:15:09,281 --> 15:15:13,061 D-I-C-T () open paren closed\nparen gives you a dictionary. 16004 15:15:13,061 --> 15:15:15,822 Or you can simplify\nthe syntax, actually 16005 15:15:15,822 --> 15:15:18,881 and a dictionary again is just keys\n 16006 15:15:18,881 --> 15:15:21,581 You can also just use\ncurly braces instead. 16007 15:15:21,582 --> 15:15:23,542 That gives me an empty dictionary. 16008 15:15:23,542 --> 15:15:26,921 But if I know what I want to put in it\n 16009 15:15:26,921 --> 15:15:34,311 with a number of plus 1-617-495-1000,\n 16010 15:15:34,311 --> 15:15:40,298 David, with plus 1-949-468-2750. 
16011 15:15:40,298 --> 15:15:42,881 And it came to my attention,\ntragically, after class that day 16012 15:15:42,881 --> 15:15:44,673 that we had a bug in\nour little Easter egg. 16013 15:15:44,673 --> 15:15:47,711 If today, you would like to call\nme or text me, at that number 16014 15:15:47,711 --> 15:15:50,652 we have fixed the code that\nunderlies that little Easter egg. 16015 15:15:51,612 --> 15:15:53,561 All right, so this now\ngives me a variable 16016 15:15:53,561 --> 15:15:57,641 called people, that's\nassociating keys with values. 16017 15:15:57,641 --> 15:16:01,752 There is some new syntax here in\n 16018 15:16:01,752 --> 15:16:04,811 but the colons, and the quotes\non the left and the right. 16019 15:16:04,811 --> 15:16:07,901 This is a way, in Python,\nof associating keys 16020 15:16:07,902 --> 15:16:11,872 with values, words with definitions,\n 16021 15:16:11,872 --> 15:16:15,072 And it's going to be a super-common\n 16022 15:16:15,072 --> 15:16:18,972 when we look at CSS and HTML and\n 16023 15:16:18,972 --> 15:16:22,362 are like this omnipresent idea in\n 16024 15:16:22,362 --> 15:16:25,822 because it's just a really useful way\n 16025 15:16:25,822 --> 15:16:29,211 So, at this point in the story, we\n 16026 15:16:29,211 --> 15:16:32,711 if you will, of people, associating\nnames with phone numbers 16027 15:16:32,711 --> 15:16:34,197 just like a real world phone book. 16028 15:16:34,197 --> 15:16:37,722 So let's write a program that gets\n 16029 15:16:37,722 --> 15:16:39,912 whose number they would like to look up. 16030 15:16:39,911 --> 15:16:46,031 Then, let's go ahead and say, if that\n 16031 15:16:46,031 --> 15:16:48,612 go ahead and print out\nthat person's number 16032 15:16:48,612 --> 15:16:51,252 by going into the people\ndictionary and going 16033 15:16:51,252 --> 15:16:56,002 to that specific name, within there,\n 16034 15:16:56,002 --> 15:16:58,482 So this is similar in spirit to before. 
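A minimal sketch of the phone book dictionary and lookup just described. The name is hard-coded in place of CS50's get_string so the example runs without that library; the pairing of Carter with the first number follows what is typed later in the lecture, and the label text in the output is an assumption.

```python
# Keys are names, values are phone numbers.
people = {
    "Carter": "+1-617-495-1000",
    "David": "+1-949-468-2750",
}

name = "David"  # stands in for get_string("Name: ")

# "in" on a dictionary asks whether the key is present.
if name in people:
    # Square brackets index into the dictionary by key rather than by position.
    number = people[name]
    print(f"Number: {number}")
```

Indexing with a key that is absent raises a KeyError, which is one reason to check with in first.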
16035 15:16:58,482 --> 15:17:02,652 Linear search and dictionary lookups\n 16036 15:17:02,652 --> 15:17:05,802 in Python, by just asking the\nquestion, if name in people. 16037 15:17:05,802 --> 15:17:07,692 And this line is just\ngoing to print out 16038 15:17:07,692 --> 15:17:12,232 whoever is in the people\ndictionary, at that name. 16039 15:17:12,232 --> 15:17:16,722 So I'm using square brackets, because\n 16040 15:17:16,722 --> 15:17:19,842 just like you can index into\nan array, or a list in Python 16041 15:17:19,841 --> 15:17:24,671 using numbers, 0, 1, 2, you\ncan very conveniently index 16042 15:17:24,671 --> 15:17:29,601 into a dictionary in Python,\nusing square brackets, as well. 16043 15:17:29,601 --> 15:17:32,591 And just to make clear what's\ngoing on here, let me go 16044 15:17:32,591 --> 15:17:37,002 and create a temporary variable,\n 16045 15:17:37,002 --> 15:17:41,531 And then let's just, or, sorry, let's\n 16046 15:17:41,531 --> 15:17:44,411 And that will just print\nout the number in question. 16047 15:17:44,411 --> 15:17:48,371 In C, and previously in Python,\n 16048 15:17:48,372 --> 15:17:53,472 would have been go to a location in\n 16049 15:17:53,472 --> 15:17:57,311 But that can actually be a string,\n 16050 15:17:57,311 --> 15:17:59,351 And this is what's amazing\nabout dictionaries 16051 15:17:59,351 --> 15:18:02,411 it's not like a big\nline, a big linear thing. 16052 15:18:02,411 --> 15:18:05,261 It's this table, that you can\nlook up in one column the name 16053 15:18:05,262 --> 15:18:07,582 and get back in the\nother column the number. 16054 15:18:07,582 --> 15:18:09,641 So let's go ahead and run\nPython of Phonebook.py 16055 15:18:14,622 --> 15:18:18,402 That's not what's\nsupposed to happen at all. 16056 15:18:18,402 --> 15:18:19,961 I think I'm in the wrong place. 16057 15:18:32,351 --> 15:18:36,491 Python of Phonebook.py, what the-- 16058 15:18:55,661 --> 15:18:57,776 What am I not understanding here? 
16059 15:19:00,701 --> 15:19:03,869 OK, Roxanne, Carter, do you\nsee what I'm doing wrong? 16060 15:19:10,752 --> 15:19:14,631 SPEAKER 47: When you found the test\n 16061 15:19:14,631 --> 15:19:19,911 DAVID J. MALAN: Oh, yeah, found,\nOK, we're going to do this. 16062 15:19:31,881 --> 15:19:33,792 All this is coming out of the video. 16063 15:19:42,805 --> 15:19:44,722 I will try to figure out\nwhat was going wrong. 16064 15:19:44,722 --> 15:19:47,322 The best I can tell, it was\nrunning the wrong program. 16065 15:19:47,322 --> 15:19:49,341 I don't quite understand why. 16066 15:19:49,341 --> 15:19:50,691 So we will diagnose this later. 16067 15:19:50,692 --> 15:19:53,484 I just put the file into a temporary\n 16068 15:19:53,483 --> 15:19:59,231 So let me go ahead and just run\nthis, Python of Phonebook.py 16069 15:19:59,232 --> 15:20:00,762 type in, for instance, my name. 16070 15:20:00,762 --> 15:20:02,940 And there's my corresponding number. 16071 15:20:02,940 --> 15:20:04,482 Have no idea what was just happening. 16072 15:20:04,482 --> 15:20:06,582 But I will get to the\nbottom of it and update you 16073 15:20:06,582 --> 15:20:07,882 if we can put our finger on it. 16074 15:20:07,881 --> 15:20:11,411 So this was just an example, now,\nof implementing a phone book. 16075 15:20:11,411 --> 15:20:14,111 Let's now consider what we\ncan do that's a little more 16076 15:20:14,112 --> 15:20:16,932 powerful, in these examples,\nlike a phone book that 16077 15:20:16,932 --> 15:20:18,671 actually keeps this information around. 16078 15:20:18,671 --> 15:20:22,031 Thus far, these simple phone book\n 16079 15:20:22,031 --> 15:20:25,301 But using CSV files,\ncomma separated values 16080 15:20:25,302 --> 15:20:28,077 maybe we could actually keep\naround the names and numbers 16081 15:20:28,076 --> 15:20:29,951 so that, like on your\nphone, you can actually 16082 15:20:29,951 --> 15:20:32,301 keep your contacts around long-term. 
16083 15:20:32,302 --> 15:20:35,582 So I'm going to go ahead now and\n 16084 15:20:35,582 --> 15:20:39,762 And let me just hide this\ndetail, so it's not confusing. 16085 15:20:39,762 --> 15:20:43,152 Whoops, I'm going to change\nmy prompt temporarily. 16086 15:20:43,152 --> 15:20:47,062 So let me go ahead now and\nrefine this example as follows. 16087 15:20:47,061 --> 15:20:50,351 I'm going to go into\nPhonebook.py, and I'm 16088 15:20:50,351 --> 15:20:52,811 going to import a whole\nlibrary called CSV. 16089 15:20:52,811 --> 15:20:54,671 And this is a powerful\none, because Python 16090 15:20:54,671 --> 15:20:58,391 comes with a library that just\nhandles CSV files for you. 16091 15:20:58,391 --> 15:21:02,122 A CSV file is just a file\nwith comma separated values. 16092 15:21:02,122 --> 15:21:06,102 And, in fact, to demonstrate\nthis, let me check on one thing 16093 15:21:06,101 --> 15:21:08,981 here, just to make this\na little more real. 16094 15:21:08,982 --> 15:21:15,532 To demonstrate this, let's\ngo ahead and do this. 16095 15:21:15,531 --> 15:21:18,491 Let me import the CSV library, and get string from CS50. 16096 15:21:20,351 --> 15:21:24,072 Let me then open a file,\nusing the open function 16097 15:21:24,072 --> 15:21:28,932 open a file called\nPhonebook.csv, in append mode 16098 15:21:28,932 --> 15:21:31,421 in contrast with read\nmode and write mode. 16099 15:21:31,421 --> 15:21:34,972 Write just blows it away if it exists,\n 16100 15:21:34,972 --> 15:21:37,452 So I keep this phone book\naround, just like you might 16101 15:21:37,451 --> 15:21:39,389 keep adding contacts to your phone. 16102 15:21:39,389 --> 15:21:41,932 Now let me go ahead and get a\ncouple of values from the user. 16103 15:21:41,932 --> 15:21:45,342 Let me say getString and\nask the user for a name. 16104 15:21:45,341 --> 15:21:50,681 Then let me getString again, and\nask the user for their number. 16105 15:21:50,682 --> 15:21:52,707 And now, let me go ahead and do this. 
16106 15:21:52,707 --> 15:21:54,582 And this is new, and\nthis is Python-specific. 16107 15:21:54,582 --> 15:21:57,342 And you would only know this\nby following a tutorial 16108 15:21:57,341 --> 15:21:59,002 or reading the documentation. 16109 15:21:59,002 --> 15:22:01,391 Let me give myself a\nvariable called writer 16110 15:22:01,391 --> 15:22:06,472 and ask the CSV library\nfor a writer to that file. 16111 15:22:06,472 --> 15:22:09,912 Then, let me go ahead and\nuse that writer variable 16112 15:22:09,911 --> 15:22:13,241 use a function or a method\ninside of it, called write row 16113 15:22:13,241 --> 15:22:17,722 to write out a list containing\nthat person's name and number. 16114 15:22:17,722 --> 15:22:20,832 Notice the square brackets\ninside the parentheses 16115 15:22:20,832 --> 15:22:25,872 because I'm just printing a list\n 16116 15:22:25,872 --> 15:22:27,622 And then I'm just going\nto close the file. 16117 15:22:27,622 --> 15:22:29,264 So what is the effect of all of this? 16118 15:22:29,264 --> 15:22:31,722 Well, let me go ahead and run\nthis version of Phonebook.py 16119 15:22:31,722 --> 15:22:33,202 and I'm prompted for a name. 16120 15:22:33,201 --> 15:22:41,651 Let's do Carter's first, plus\n1-617-495-1000, and then 16121 15:22:44,292 --> 15:22:47,482 Notice in my current directory,\n 16122 15:22:47,482 --> 15:22:50,952 which I wrote, and\napparently Phonebook.csv. 16123 15:22:50,951 --> 15:22:53,351 CSV just stands for\ncomma separated values. 16124 15:22:53,351 --> 15:22:56,902 And it's like a very simple way\n 16125 15:22:56,902 --> 15:23:00,192 if you will, where the comma represents\n 16126 15:23:00,192 --> 15:23:02,891 There's only two columns\nhere, name and number. 
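The append-to-CSV steps above can be sketched as follows. Two things here are assumptions and not from the lecture: the file is written into a temporary directory instead of the current one, and the name and number are passed to a hypothetical add_contact function in place of get_string prompts.

```python
import csv
import os
import tempfile

# A throwaway location so the sketch does not touch a real phonebook.csv.
path = os.path.join(tempfile.mkdtemp(), "phonebook.csv")


def add_contact(name, number):
    # "a" appends to the file; "w" would blow away any existing contents.
    # newline="" is what the csv module's documentation recommends for its files.
    file = open(path, "a", newline="")
    writer = csv.writer(file)
    # writerow takes one list, hence the square brackets inside the parentheses.
    writer.writerow([name, number])
    file.close()


add_contact("Carter", "+1-617-495-1000")
add_contact("David", "+1-949-468-2750")

# Read the raw text back to see what was persisted.
file = open(path, newline="")
contents = file.read()
file.close()
print(contents)
```

Each call adds one line to the file, with a comma separating the two columns.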
16127 15:23:02,891 --> 15:23:06,101 But, because I'm writing to\nthis file in append mode 16128 15:23:06,101 --> 15:23:09,741 let me run it one more time,\nPython of Phonebook.py 16129 15:23:09,741 --> 15:23:18,011 and let me go ahead and do David\nand plus 1-949-468-2750, Enter. 16130 15:23:18,012 --> 15:23:19,872 And notice what happened\nin the CSV file. 16131 15:23:19,872 --> 15:23:22,902 It automatically updated,\nbecause I'm now persisting 16132 15:23:22,902 --> 15:23:25,522 this data to the file in question. 16133 15:23:25,521 --> 15:23:27,881 So if I wanted to now\nread this file in, I 16134 15:23:27,881 --> 15:23:32,201 could actually go ahead and\ndo linear search on the data 16135 15:23:32,201 --> 15:23:35,171 using a read function to\nactually read from the CSV. 16136 15:23:35,171 --> 15:23:37,871 But, for now, we'll just leave\nit a little simply as write. 16137 15:23:37,872 --> 15:23:39,792 And let me make one refinement here. 16138 15:23:39,792 --> 15:23:43,542 It turns out that, if you're in\nthe habit of re-opening a file 16139 15:23:43,542 --> 15:23:45,851 you don't have to even\nclose it explicitly. 16140 15:23:47,442 --> 15:23:52,572 You can instead say, with the opening\n 16141 15:23:52,572 --> 15:23:57,822 in append mode, calling the thing file,\n 16142 15:23:58,872 --> 15:24:00,899 So the with keyword is\na new thing in Python. 16143 15:24:00,898 --> 15:24:03,731 And it's used in a few different\n 16144 15:24:03,732 --> 15:24:04,857 is to tighten up code here. 16145 15:24:04,857 --> 15:24:06,940 And I'm going to move my\nvariables to the outside 16146 15:24:06,940 --> 15:24:09,432 because they don't need to be\ninside of the with statement 16147 15:24:10,389 --> 15:24:12,974 This just has the effect of\nensuring that you, the programmer 16148 15:24:12,974 --> 15:24:15,312 don't screw up, and accidentally\ndon't close your file. 
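Here is a sketch of the same writer using the with keyword, plus the read-back that the lecture mentions but leaves for later. The lookup function, its linear scan, and the temporary file location are assumptions added for illustration.

```python
import csv
import os
import tempfile

# A throwaway location so the sketch does not touch a real phonebook.csv.
path = os.path.join(tempfile.mkdtemp(), "phonebook.csv")


def add_contact(name, number):
    # The file is closed automatically when the with block ends,
    # even if an error occurs inside it.
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])


def lookup(name):
    """Linear search over the rows of the CSV (hypothetical helper)."""
    with open(path, newline="") as file:
        for row in csv.reader(file):
            if row[0] == name:
                return row[1]
    return None


add_contact("Carter", "+1-617-495-1000")
add_contact("David", "+1-949-468-2750")
print(lookup("David"))
```

No explicit close call appears anywhere, which is the point of the with statement.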
16149 15:24:15,311 --> 15:24:17,201 In fact, you might\nrecall, from C, Valgrind 16150 15:24:17,201 --> 15:24:21,758 might have complained at you, if you had\n 16151 15:24:21,758 --> 15:24:24,341 you might have had a memory leak\nas a result. The with keyword 16152 15:24:24,341 --> 15:24:28,361 takes care of all of\nthat for you, as well. 16153 15:24:28,362 --> 15:24:31,192 How about let's do, want to do this. 16154 15:24:31,192 --> 15:24:34,482 How about, let's do one other thing. 16155 15:24:35,752 --> 15:24:38,802 Let me go ahead and propose,\nthat on your phone or laptop 16156 15:24:38,802 --> 15:24:43,992 here, or online, go to this URL here,\n 16157 15:24:43,991 --> 15:24:46,811 And just to show that these CSVs\n 16158 15:24:46,811 --> 15:24:48,371 and if you've ever\nlike used a Google Form 16159 15:24:48,372 --> 15:24:50,082 or managed a student group,\nor something where you've 16160 15:24:50,082 --> 15:24:52,272 collected data via Google\nForms, you can actually 16161 15:24:52,271 --> 15:24:55,161 export all of that data via CSV files. 16162 15:24:55,161 --> 15:24:57,671 So go ahead to this URL here. 16163 15:24:57,671 --> 15:24:59,472 And those of you\nwatching on demand later 16164 15:24:59,472 --> 15:25:01,061 will find that the form\nis no longer working 16165 15:25:01,061 --> 15:25:02,551 since we're only doing this live. 16166 15:25:02,552 --> 15:25:04,302 But that will lead to\na Google Form that's 16167 15:25:04,302 --> 15:25:07,272 going to let everyone input\ntheir answer to a question 16168 15:25:07,271 --> 15:25:10,182 like what house do you\nwant to end up into 16169 15:25:10,182 --> 15:25:13,152 sort of an approximation of the\nsorting hat in Harry Potter. 16170 15:25:13,152 --> 15:25:17,202 And via this form, will we then\nhave the ability to export 16171 15:25:20,302 --> 15:25:24,132 So let's give you a moment to do that. 
16172 15:25:24,131 --> 15:25:26,981 In just a moment, I'll share\nmy version of the screen, which 16173 15:25:26,982 --> 15:25:30,852 is going to let me actually\nopen the file, the form itself. 16174 15:25:30,851 --> 15:25:35,591 And in just a moment, I'll switch over. 16175 15:25:35,591 --> 15:25:37,541 OK, so this is now my\nversion of the form 16176 15:25:37,542 --> 15:25:40,811 here, where we have 200 plus responses\n 16177 15:25:40,811 --> 15:25:44,531 house do you belong in, Gryffindor,\n 16178 15:25:44,531 --> 15:25:49,322 If I go over to responses, I'll see all\n 16179 15:25:49,322 --> 15:25:51,822 So graphical user interface,\nand we could flip through this. 16180 15:25:51,822 --> 15:25:56,531 And it looks like, interestingly,\n40% of Harvard students 16181 15:25:56,531 --> 15:26:00,745 want to be in Gryffindor, 22%\nin Slytherin, and everyone else 16182 15:26:01,661 --> 15:26:03,791 But you might have noticed,\nif ever using a Google Form 16183 15:26:03,792 --> 15:26:05,241 this Google Spreadsheets link. 16184 15:26:05,241 --> 15:26:06,531 So I'm going to go ahead and click that. 16185 15:26:06,531 --> 15:26:08,981 And that's going to automatically open,\n 16186 15:26:08,982 --> 15:26:11,812 But you can do the same thing\nwith Office 365 as well. 16187 15:26:11,811 --> 15:26:14,561 And now you see the raw\ndata as a spreadsheet. 16188 15:26:14,561 --> 15:26:19,421 But in Google Spreadsheets, if I go\n 16189 15:26:19,421 --> 15:26:23,322 notice I can download this as\nan Excel file, a PDF, and also 16190 15:26:23,322 --> 15:26:25,432 a CSV, comma separated values. 16191 15:26:25,432 --> 15:26:27,141 So let me go ahead and do that. 16192 15:26:27,141 --> 15:26:30,442 That gives me a file in my\nDownloads folder on my computer. 16193 15:26:30,442 --> 15:26:34,492 I'm going to now go back\nto my code editor here. 
16194 15:26:34,491 --> 15:26:36,701 And what I'm going to go\nahead and do is upload 16195 15:26:36,701 --> 15:26:40,841 this file, from my\nDownloads folder to VS Code 16196 15:26:40,841 --> 15:26:43,131 so that we can actually\nsee it within here. 16197 15:26:43,131 --> 15:26:44,741 And now you can see this open file. 16198 15:26:44,741 --> 15:26:47,741 And I'm going to shorten its name,\n 16199 15:26:47,741 --> 15:26:52,511 I'm going to rename this using the\n 16200 15:26:52,512 --> 15:26:55,889 And then we can see, in the file, that\n 16201 15:26:55,889 --> 15:26:57,972 house, where you have a\nwhole bunch of time stamps 16202 15:26:57,972 --> 15:27:00,792 when people filled out the form,\n 16203 15:27:00,792 --> 15:27:02,502 And then everyone else\njust a moment ago. 16204 15:27:02,502 --> 15:27:05,832 And the second value, after each\n 16205 15:27:05,832 --> 15:27:08,562 Well, let me go ahead here\nand implement a program 16206 15:27:08,561 --> 15:27:12,621 in a file called Hogwarts.py,\nthat processes this data. 16207 15:27:12,622 --> 15:27:14,802 So in Hogwarts.py, let's\njust write a program 16208 15:27:14,802 --> 15:27:17,962 that now reads a CSV, in\nthis case not a phone book 16209 15:27:17,961 --> 15:27:19,932 but everyone's sorting hat information. 16210 15:27:19,932 --> 15:27:21,972 And I'm going to go\nahead and Import CSV. 16211 15:27:21,972 --> 15:27:25,182 And suppose I want to answer a\nreasonable question, ignoring 16212 15:27:25,182 --> 15:27:28,991 the fact that Google's GUI or graphical\n 16213 15:27:28,991 --> 15:27:31,841 I just want to count up who's\ngoing to be in which house. 16214 15:27:31,841 --> 15:27:36,161 So let me give myself a dictionary\n 16215 15:27:37,302 --> 15:27:39,312 And let me pre-create a few keys. 
16216 15:27:39,311 --> 15:27:44,021 Let me say Gryffindor is\ngoing to be initialized to 0 16217 15:27:44,021 --> 15:27:48,341 Hufflepuff will be initialized\nto 0 as well, Ravenclaw 16218 15:27:49,722 --> 15:27:53,292 And finally, Slytherin\nwill be initialized to 0. 16219 15:27:53,292 --> 15:27:56,472 So here's another example of\na dictionary, or a hash table 16220 15:27:56,472 --> 15:27:58,662 just being a very\ngeneral-purpose piece of data. 16221 15:27:58,661 --> 15:28:00,281 You can have keys and values. 16222 15:28:00,281 --> 15:28:01,991 The keys, in this case, are the houses. 16223 15:28:01,991 --> 15:28:05,021 The values are initially zero,\nbut I'm going to use this 16224 15:28:05,021 --> 15:28:10,121 instead of like four separate variables,\n 16225 15:28:12,252 --> 15:28:19,701 With opening Hogwarts.csv, in read mode,\n 16226 15:28:19,701 --> 15:28:22,961 I just want to read it, as\nfile as my variable name. 16227 15:28:22,961 --> 15:28:26,051 Let's go ahead and create\na reader this time 16228 15:28:26,052 --> 15:28:31,232 that is using the reader function in\n 16229 15:28:31,232 --> 15:28:33,732 I'm going to go ahead and ignore\nthe first line of the file 16230 15:28:33,732 --> 15:28:36,792 because, recall, that the first\n 16231 15:28:36,792 --> 15:28:37,972 I want to get the real data. 16232 15:28:37,972 --> 15:28:40,061 So this next function\nis just a little trick 16233 15:28:40,061 --> 15:28:43,252 for ignoring the first line of the file. 16234 15:28:44,322 --> 15:28:48,701 For every other row in the\nreader, that is line by line 16235 15:28:48,701 --> 15:28:51,941 get the current person's house,\nwhich is in row bracket 1. 16236 15:28:51,942 --> 15:28:54,735 This is what the CSV reader\nlibrary is doing for us. 16237 15:28:54,735 --> 15:28:56,652 It's handling all of the\nreading of this file. 16238 15:28:56,652 --> 15:29:00,281 It figures out where the comma is,\n 16239 15:29:00,281 --> 15:29:02,771 it hands you back a list of size 2. 
16240 15:29:02,771 --> 15:29:07,611 In bracket 0 is the time stamp,\nin bracket 1 is the house name. 16241 15:29:07,612 --> 15:29:11,351 So, in my code, I can say\nhouse equals row bracket 1. 16242 15:29:11,351 --> 15:29:13,491 I don't care about the time\nstamp for this program. 16243 15:29:13,491 --> 15:29:17,591 And then let's go into my dictionary\n 16244 15:29:17,591 --> 15:29:23,891 into it at the house location, by\n 16245 15:29:23,891 --> 15:29:26,802 And now, at the end\nof this block of code 16246 15:29:26,802 --> 15:29:29,562 that has the effect of iterating\nover every line of the file 16247 15:29:29,561 --> 15:29:31,991 updating my dictionary\nin four different places 16248 15:29:31,991 --> 15:29:35,711 based on whether someone typed\n 16249 15:29:36,222 --> 15:29:40,332 And notice that I'm using the name of\n 16250 15:29:40,332 --> 15:29:44,022 to essentially go up to this little\n 16251 15:29:44,021 --> 15:29:46,541 the 1 to a 2, the 2 to\na 3, instead of having 16252 15:29:46,542 --> 15:29:48,522 like four separate\nvariables, which would just 16253 15:29:48,521 --> 15:29:50,591 be much more annoying to maintain. 16254 15:29:50,591 --> 15:29:52,811 Down at the bottom, let's\njust print out the results. 16255 15:29:52,811 --> 15:29:56,141 For each house in those\nhouses, iterating over 16256 15:29:56,141 --> 15:29:58,271 the keys they're in\nby default in Python 16257 15:29:58,271 --> 15:30:01,151 let's go ahead and print\nout an f-string that says 16258 15:30:01,152 --> 15:30:05,982 the current house has the current count. 
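A minimal sketch of the counting program described above, not the lecture's exact code. It assumes a two-column export (timestamp, then house); the header text and sample rows are made up so the example runs without hogwarts.csv.

```python
import csv
import io

# Hypothetical stand-in for hogwarts.csv: one row per form response.
# The header names and timestamps here are assumptions for illustration.
SAMPLE = """Timestamp,House
10/20/2021 10:00:00,Gryffindor
10/20/2021 10:00:05,Slytherin
10/20/2021 10:00:09,Gryffindor
10/20/2021 10:00:12,Ravenclaw
"""


def count_houses(file):
    """Count the rows per house using csv.reader and positional indexing."""
    # Pre-create the four keys, each starting at 0.
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    reader = csv.reader(file)
    next(reader)  # skip the header row
    for row in reader:
        house = row[1]  # row[0] is the timestamp, row[1] is the house
        houses[house] += 1
    return houses


if __name__ == "__main__":
    # With the real file this would be: with open("hogwarts.csv", "r") as file:
    with io.StringIO(SAMPLE) as file:
        counts = count_houses(file)
    for house in counts:
        print(f"{house}: {counts[house]}")
```

A single dictionary keyed by house name replaces four separate counter variables, so the loop body stays one line no matter which house a row names.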
16259 15:30:05,982 --> 15:30:11,592 And count will be the result of indexing\n 16260 15:30:13,332 --> 15:30:18,461 So let's run this to summarize\nthe data, Hogwarts.py, 140 of you 16261 15:30:18,461 --> 15:30:22,722 answered Gryffindor, 54 Hufflepuff,\n 16262 15:30:22,722 --> 15:30:25,092 And that's just my now way\nof code, and this is, oh 16263 15:30:25,091 --> 15:30:28,748 my God, so much easier than C, to\n 16264 15:30:28,749 --> 15:30:32,082 And one of the reasons that Python is so\n 16265 15:30:32,082 --> 15:30:36,432 more generally, is that it's actually\n 16266 15:30:37,461 --> 15:30:38,891 And let me clean this up slightly. 16267 15:30:38,891 --> 15:30:41,682 It's a little annoying that\nI just have to know and trust 16268 15:30:41,682 --> 15:30:46,932 that the house name is in bracket\n 16269 15:30:47,961 --> 15:30:53,051 There's something called a\nDictionary Reader in the CSV library 16270 15:30:54,402 --> 15:30:58,991 Capital D, capital R, this means\n 16271 15:30:58,991 --> 15:31:01,421 because what a dictionary\nreader does is it 16272 15:31:01,421 --> 15:31:05,411 still returns to me every row from\n 16273 15:31:05,411 --> 15:31:09,081 but it doesn't just give me a list\n 16274 15:31:10,482 --> 15:31:15,522 And it uses, as the keys in that\n 16275 15:31:15,521 --> 15:31:17,981 for every row in the\nfile, which is just to say 16276 15:31:17,982 --> 15:31:20,472 it makes my code a little\nmore readable, because instead 16277 15:31:20,472 --> 15:31:23,112 of doing this little\ntrickery, bracket 1 16278 15:31:23,112 --> 15:31:26,022 I can say quote unquote "Bracket\nHouse" with a capital H 16279 15:31:26,021 --> 15:31:28,881 because it's capitalized\nin the Google Form itself. 
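The DictReader variant just described might look like the sketch below. The column name "House" follows the capitalization mentioned above; the "Timestamp" header and all sample rows are assumptions. The second sample swaps the column order to show why looking up by name is more resilient than row[1].

```python
import csv
import io

# Hypothetical sample data; only the "House" column name comes from the lecture.
SAMPLE = """Timestamp,House
10/20/2021 10:00:00,Gryffindor
10/20/2021 10:00:05,Slytherin
10/20/2021 10:00:09,Gryffindor
"""

# Same responses with the columns moved around, as might happen in a spreadsheet.
REORDERED = """House,Timestamp
Gryffindor,10/20/2021 10:00:00
Slytherin,10/20/2021 10:00:05
Gryffindor,10/20/2021 10:00:09
"""


def count_houses(file):
    """Count the rows per house using csv.DictReader and the column name."""
    houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}
    # DictReader consumes the header row itself and uses it as the keys,
    # so there is no next(reader) call here.
    reader = csv.DictReader(file)
    for row in reader:
        houses[row["House"]] += 1
    return houses


if __name__ == "__main__":
    print(count_houses(io.StringIO(SAMPLE)))
    print(count_houses(io.StringIO(REORDERED)))
```

Both calls give the same counts, because the lookup depends on the header text rather than on the column position.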
16280 15:31:28,881 --> 15:31:31,319 So the code now is\njust minorly different 16281 15:31:31,319 --> 15:31:34,362 but it's way more resilient, especially\n 16282 15:31:34,362 --> 15:31:36,912 and I'm moving the columns around\nor doing something like that 16283 15:31:36,911 --> 15:31:38,494 where the numbers might get messed up. 16284 15:31:38,495 --> 15:31:41,781 Now I can run this on Hogwarts.py\n 16285 15:31:41,781 --> 15:31:46,481 But I now don't have to worry about\n 16286 15:31:46,482 --> 15:31:51,402 All right, any questions on\nthose capabilities there? 16287 15:31:51,402 --> 15:31:53,921 And that's a teaser of sorts,\nfor some of the manipulation 16288 15:31:56,141 --> 15:32:00,076 All right, so some final\nexamples and flair, to intrigue 16289 15:32:00,076 --> 15:32:01,451 with what you can do with Python. 16290 15:32:01,451 --> 15:32:05,231 I'm going to actually switch over\n 16291 15:32:05,232 --> 15:32:08,422 so that I can actually use\naudio a little more effectively. 16292 15:32:08,421 --> 15:32:10,451 So here's just a terminal\nwindow on macOS. 16293 15:32:10,451 --> 15:32:14,471 I before class have preinstalled\n 16294 15:32:14,472 --> 15:32:16,900 that won't really work\nin VS Code in the cloud 16295 15:32:16,900 --> 15:32:20,057 because they require audio that the\n 16296 15:32:20,057 --> 15:32:22,182 But I'm going to go ahead\nand write an example here 16297 15:32:22,182 --> 15:32:26,080 that involves writing a speech-based\n 16298 15:32:26,733 --> 15:32:28,691 And I'm going to go ahead\nand import a library 16299 15:32:28,692 --> 15:32:32,230 that, again, I pre-installed,\ncalled Python text to speech 16300 15:32:32,230 --> 15:32:34,781 and I'm going to go ahead\nand, per its documentation 16301 15:32:34,781 --> 15:32:39,400 give myself a speech engine, by\n 16302 15:32:40,601 --> 15:32:43,451 I'm then going to use this\nengine's say function 16303 15:32:43,451 --> 15:32:45,701 to do something fun, like Hello, world.
16304 15:32:45,701 --> 15:32:49,002 And then I'm going to go ahead and\n 16305 15:32:50,377 --> 15:32:52,002 All right, I'm going to save this file. 16306 15:32:52,002 --> 15:32:53,502 I'm not using VS Code at the moment. 16307 15:32:53,502 --> 15:32:56,591 I'm using another popular program\n 16308 15:32:56,591 --> 15:32:59,351 called Vim, which is a\ncommand line program that's 16309 15:32:59,351 --> 15:33:01,311 just in this black and white window. 16310 15:33:01,311 --> 15:33:05,370 Let me go ahead now and run\nPython of Speech.py, and-- 16311 15:33:07,266 --> 15:33:09,641 DAVID J. MALAN: All right, so\nit's a little computerized 16312 15:33:09,641 --> 15:33:12,635 but it is speech that has been\nsynthesized from this example. 16313 15:33:12,635 --> 15:33:14,802 Let's change it a little\nbit to be more interesting. 16314 15:33:14,802 --> 15:33:16,010 Let's do something like this. 16315 15:33:16,010 --> 15:33:20,472 Let's ask the user for their name,\n 16316 15:33:20,472 --> 15:33:24,372 And then, let's use the little F\n 16317 15:33:24,372 --> 15:33:26,532 but Hello to that person's name. 16318 15:33:26,531 --> 15:33:30,792 Let me save my file, run\nPython of Speech.py, Enter. 16319 15:33:33,881 --> 15:33:36,161 DAVID J. MALAN: All right,\nso we pronounce my name OK 16320 15:33:36,161 --> 15:33:38,828 might struggle with different\nnames, depending on the phonetics. 16321 15:33:38,828 --> 15:33:40,092 But that one seemed to be OK. 16322 15:33:40,091 --> 15:33:42,371 Let's do something else with\nPython, using similarly 16323 15:33:44,302 --> 15:33:49,062 Let me go into today's examples. 16324 15:33:49,061 --> 15:33:54,851 And I'm going to go into a folder\n 16325 15:33:57,311 --> 15:33:59,891 And in this folder, that\nI've written in advance 16326 15:33:59,891 --> 15:34:02,400 are a few files,\nDetect.py, Recognize.py 16327 15:34:02,400 --> 15:34:06,851 and two full of photos,\nOffice.jpeg and Toby.jpeg. 
16328 15:34:06,851 --> 15:34:09,320 If you're familiar with the\nshow, here, for instance 16329 15:34:09,321 --> 15:34:11,331 is the cast photo from The Office here. 16330 15:34:12,821 --> 15:34:15,161 Suppose I want to do\nsomething very Facebook-style 16331 15:34:15,161 --> 15:34:17,381 where I want to analyze\nall of the faces 16332 15:34:17,381 --> 15:34:19,391 or detect all of the faces in there. 16333 15:34:19,391 --> 15:34:21,461 Well, let me go ahead\nand show you a program 16334 15:34:21,461 --> 15:34:24,400 I wrote in advance,\nthat's not terribly long. 16335 15:34:24,400 --> 15:34:25,900 Much of it is actually comments. 16336 15:34:25,900 --> 15:34:27,161 But let's see what I'm doing. 16337 15:34:27,161 --> 15:34:30,521 I'm importing the Pillow library,\n 16338 15:34:30,521 --> 15:34:34,002 I'm importing a library called face\n 16339 15:34:36,650 --> 15:34:39,480 According to its documentation,\nyou go into that library 16340 15:34:39,480 --> 15:34:41,281 and you call a function\ncalled load image 16341 15:34:41,281 --> 15:34:43,891 file, to load something\nlike Office.jpeg 16342 15:34:43,891 --> 15:34:46,561 and then you can use the\nline of code like this. 16343 15:34:46,561 --> 15:34:50,641 Call a function called face\nlocations, passing the images input 16344 15:34:50,641 --> 15:34:53,641 and you get back a list of\nall of the faces in the image. 16345 15:34:53,641 --> 15:34:57,271 And then down here, a for loop,\nthat iterates over all of those 16346 15:34:58,561 --> 15:35:01,320 And inside of this loop, I\njust do a bit of trickery. 16347 15:35:01,321 --> 15:35:06,102 I figure out the top, right, bottom,\n 16348 15:35:06,101 --> 15:35:08,461 And then, using these\nlines of code here 16349 15:35:08,461 --> 15:35:11,355 I'm using that image library,\nto just draw a box, essentially. 16350 15:35:11,355 --> 15:35:12,480 And the code looks cryptic. 16351 15:35:12,480 --> 15:35:14,671 Honestly, I would have to look\nthis up to write it again. 
16352 15:35:14,671 --> 15:35:17,171 But per the documentation, this\njust draws a nice little box 16353 15:35:18,131 --> 15:35:24,721 So let me go ahead and zoom out here,\n 16354 15:35:24,722 --> 15:35:29,912 All right, it's analyzing, analyzing,\n 16355 15:35:30,902 --> 15:35:35,702 And here is every face that my,\nwhat, 10 lines of Python code 16356 15:35:37,932 --> 15:35:40,711 Presumably the library\nis looking for something 16357 15:35:40,711 --> 15:35:43,622 maybe without a mask, that has\ntwo eyes, a nose, and a mouth 16358 15:35:43,622 --> 15:35:45,942 in some kind of arrangement,\nsome kind of pattern. 16359 15:35:45,942 --> 15:35:48,961 So it would seem pretty reliable, at\n 16360 15:35:49,891 --> 15:35:52,182 What if we want to look\nfor someone specific 16361 15:35:52,182 --> 15:35:53,701 for instance, someone that's\nalways getting picked on. 16362 15:35:53,701 --> 15:35:55,284 Well, we could do something like this. 16363 15:35:55,285 --> 15:35:59,582 Recognize.py, which is taking two files\n 16364 15:35:59,582 --> 15:36:01,141 of one person in particular. 16365 15:36:01,141 --> 15:36:03,421 And if you're trying to\nfind Toby in a crowd 16366 15:36:03,421 --> 15:36:06,091 here I conflated the program,\nsorry, this is the version that 16367 15:36:06,091 --> 15:36:08,072 draws a box around the given face. 16368 15:36:08,072 --> 15:36:10,201 Here we have Toby as identified. 16369 15:36:10,741 --> 15:36:14,972 Because that program, Recognize.py,\n 16370 15:36:14,972 --> 15:36:19,322 but long story short, it additionally\nloads as input Toby.jpeg 16371 15:36:19,322 --> 15:36:21,932 in order to recognize\nthat specific face. 16372 15:36:21,932 --> 15:36:24,872 And that specific face is a\ncompletely different photo 16373 15:36:24,872 --> 15:36:29,492 but it looks similar enough to the\n 16374 15:36:29,491 --> 15:36:32,341 Let's do one other that's a\nlittle sensitive to microphones. 
16375 15:36:32,341 --> 15:36:37,171 Let me go into, how about my listen\n 16376 15:36:38,131 --> 15:36:40,901 And let's just run Python of Listen0.py. 16377 15:36:40,902 --> 15:36:43,952 I'm going to type in like David. 16378 15:36:43,951 --> 15:36:47,041 Oh, sorry, no, I'm going to-- 16379 15:36:52,567 --> 15:36:53,942 Oh, no, that's the wrong version. 16380 15:36:53,942 --> 15:36:55,772 [CHUCKLES] OK, I looked like an idiot. 16381 15:36:58,832 --> 15:37:02,822 And if I say goodbye, I'm talking\n 16382 15:37:02,822 --> 15:37:05,112 Now it's detecting what I'm saying here. 16383 15:37:05,112 --> 15:37:08,652 So this first version of the program is\n 16384 15:37:08,652 --> 15:37:12,993 elif elif, and it's just asking\n 16385 15:37:12,993 --> 15:37:14,951 And that was my mistake\nwith the first example. 16386 15:37:14,951 --> 15:37:17,881 And then, I'm just checking,\nis Hello in the user's words? 16387 15:37:17,881 --> 15:37:19,339 Is how are you in the user's words? 16388 15:37:19,339 --> 15:37:20,673 Didn't see that, but it's there. 16389 15:37:20,673 --> 15:37:21,991 Is goodbye in the user's words? 16390 15:37:21,991 --> 15:37:25,801 Now let's do a cooler version, using a\n 16391 15:37:33,241 --> 15:37:40,691 Let's do version 2 of this, that\n 16392 15:37:43,682 --> 15:37:46,232 OK, so now it's artificial intelligence. 16393 15:37:46,232 --> 15:37:48,332 Now let's do something a\nlittle more interesting. 16394 15:37:48,332 --> 15:37:51,752 The third version of this program that\n 16395 15:37:53,402 --> 15:37:55,322 Hello, world, my name is David. 16396 15:37:59,281 --> 15:38:02,521 OK, so that time, it not\nonly analyzed what I said 16397 15:38:02,521 --> 15:38:04,451 but it plucked my name out of it. 16398 15:38:04,451 --> 15:38:07,002 Let's do two final examples. 16399 15:38:07,002 --> 15:38:09,671 This one will generate a QR code. 
16400 15:38:09,671 --> 15:38:11,641 Let me go ahead and\nwrite a program called 16401 15:38:11,641 --> 15:38:15,552 QR.py, that very simply does this. 16402 15:38:15,552 --> 15:38:17,342 Let me import a library called OS. 16403 15:38:17,341 --> 15:38:19,752 Let me import a library called QR code. 16404 15:38:19,752 --> 15:38:24,521 Let me grab an image\nhere, that's QRcode.make. 16405 15:38:24,521 --> 15:38:27,961 And let me give you the URL of like a\n 16406 15:38:31,561 --> 15:38:36,362 Let me just type this,\nso I don't get it wrong. 16407 15:38:36,362 --> 15:38:41,822 OK, so if I now use this URL here,\nof a video on YouTube, making 16408 15:38:41,822 --> 15:38:44,334 sure I haven't made any\ntypos, I'm now going 16409 15:38:44,334 --> 15:38:46,292 to go ahead and do two\nlines of code in Python. 16410 15:38:46,292 --> 15:38:49,982 I'm going to first save that as\na file called QR.png, which is 16411 15:38:49,982 --> 15:38:52,012 a two dimensional barcode, a QR code. 16412 15:38:52,012 --> 15:38:53,762 And, indeed, I'm going\nto use this format. 16413 15:38:53,762 --> 15:39:00,312 And I'm going to use the OS.system\n 16414 15:39:00,311 --> 15:39:02,612 And if you'd like to take\nout your phone at this point 16415 15:39:02,612 --> 15:39:08,792 you can see the result of my barcode,\n 16416 15:39:08,792 --> 15:39:10,307 Hopefully from afar that will scan. 16417 15:39:16,671 --> 15:39:18,981 And I think that's an\nappropriate line to end on. 16418 15:40:43,362 --> 15:40:45,372 DAVID J. MALAN: This is CS50. 16419 15:40:45,372 --> 15:40:48,702 And this is week 7, the\nweek, here, of Halloween. 16420 15:40:48,701 --> 15:40:51,851 Indeed, special thanks to\nCS50's own Valerie and her mom 16421 15:40:51,851 --> 15:40:55,932 for having created this very festive\n 16422 15:40:55,932 --> 15:40:58,671 Today, we pick up where\nwe left off last time 16423 15:40:58,671 --> 15:41:00,521 which, recall, we introduced Python. 
16424 15:41:00,521 --> 15:41:03,581 And that was our big transition\nfrom C, where suddenly things 16425 15:41:03,582 --> 15:41:06,192 started to look new again,\nprobably, syntactically. 16426 15:41:06,192 --> 15:41:09,732 But also, probably things\nhopefully started to feel easier. 16427 15:41:09,732 --> 15:41:13,422 Well, with that said, problem set\n 16428 15:41:14,722 --> 15:41:18,432 But hopefully you've begun to appreciate\n 16429 15:41:19,631 --> 15:41:22,301 You get more out of the box\nwith the language itself. 16430 15:41:22,302 --> 15:41:24,792 And that's going to be so\nuseful over the coming weeks 16431 15:41:24,792 --> 15:41:29,322 as we transition further to introducing\n 16432 15:41:29,322 --> 15:41:31,612 web programming next\nweek and the week after. 16433 15:41:31,612 --> 15:41:34,241 So that by term's end, and perhaps\neven for your final project 16434 15:41:34,241 --> 15:41:37,211 you really are building\nsomething from scratch 16435 15:41:37,211 --> 15:41:40,756 using all of these various\ntools somehow together. 16436 15:41:40,756 --> 15:41:42,881 So before we do that,\nthough, today, let's consider 16437 15:41:42,881 --> 15:41:47,951 what we weren't really able to\ndo last week, which was actually 16438 15:41:47,951 --> 15:41:50,831 create and store data ourselves. 16439 15:41:50,832 --> 15:41:56,052 In Python, we've played around with the\n 16440 15:41:56,052 --> 15:41:59,152 And you've been able to\nread in CSVs from disk 16441 15:41:59,152 --> 15:42:03,222 so to speak, that is, from files\n 16442 15:42:03,222 --> 15:42:06,796 But we haven't necessarily started\n 16443 15:42:06,796 --> 15:42:09,671 And that's a huge limitation, because\n 16444 15:42:09,671 --> 15:42:11,629 we've done thus far with\na couple of exceptions 16445 15:42:11,629 --> 15:42:14,891 have involved my providing input\n 16446 15:42:14,891 --> 15:42:16,849 But then nothing happens to it. 
16447 15:42:16,849 --> 15:42:18,641 It disappears the moment\nthe program quits 16448 15:42:18,641 --> 15:42:20,752 because it was only\nbeing stored in memory. 16449 15:42:20,752 --> 15:42:24,432 But today, we'll start to focus all\n 16450 15:42:24,432 --> 15:42:27,442 that is, storing things\nin files and folders 16451 15:42:27,442 --> 15:42:30,281 so that you can actually\nwrite programs that remember 16452 15:42:30,281 --> 15:42:31,961 what it is the human did last time. 16453 15:42:31,961 --> 15:42:34,661 And ultimately, you can\nactually make mobile or web apps 16454 15:42:34,661 --> 15:42:37,391 that actually begin to grow, and\ngrow, and grow their data sets 16455 15:42:37,391 --> 15:42:40,991 as might happen if you get more and\n 16456 15:42:40,991 --> 15:42:44,741 To play, then, with this new capability\n 16457 15:42:44,741 --> 15:42:47,559 let's go ahead and\njust collect some data. 16458 15:42:47,559 --> 15:42:49,391 In fact, those of you\nhere in person, if you 16459 15:42:49,391 --> 15:42:52,302 want to pull up this URL\non your phone or laptop 16460 15:42:52,302 --> 15:42:54,402 that's going to lead\nyou to a Google Form. 16461 15:42:54,402 --> 15:42:59,472 And that Google Form is going to\n 16462 15:43:00,661 --> 15:43:02,411 And it's going to ask\nyou to categorize it 16463 15:43:02,411 --> 15:43:06,611 according to a genre, like comedy,\n 16464 15:43:07,802 --> 15:43:09,552 And this is useful,\nbecause if you've ever 16465 15:43:09,552 --> 15:43:12,862 used a Google Form before, or\n 16466 15:43:12,862 --> 15:43:15,612 it's a really useful mechanism at\n 16467 15:43:15,612 --> 15:43:19,342 and then ultimately, putting\nit into a spreadsheet form. 16468 15:43:19,341 --> 15:43:23,502 So this is a screenshot of\nthe form that those of you 16469 15:43:23,502 --> 15:43:26,421 here in person or tuning in on\nZoom are currently filling out. 16470 15:43:26,421 --> 15:43:27,761 It's asking only two questions. 
16471 15:43:27,762 --> 15:43:29,891 What's the title of\nyour favorite TV show? 16472 15:43:29,891 --> 15:43:34,811 And what are one or more genres\ninto which your TV show falls? 16473 15:43:34,811 --> 15:43:38,201 And I'll go ahead and\npivot now to the view 16474 15:43:38,201 --> 15:43:41,008 that I'll be able to see as the\n 16475 15:43:41,008 --> 15:43:42,550 is quite simply a Google spreadsheet. 16476 15:43:42,550 --> 15:43:45,050 Google Forms has this nice\nfeature, if you've never noticed 16477 15:43:45,050 --> 15:43:47,991 that allows you to export your\ndata to a Google Spreadsheet. 16478 15:43:47,991 --> 15:43:50,472 And then from there, we\ncan actually grab the file 16479 15:43:50,472 --> 15:43:52,842 and download it to my\nown Mac or your own PC 16480 15:43:52,841 --> 15:43:55,661 so that we can actually play around\n 16481 15:43:55,661 --> 15:43:57,911 So in fact, let me go\nahead and slide over 16482 15:43:57,911 --> 15:44:01,881 to this, the live Google Spreadsheet. 16483 15:44:01,881 --> 15:44:05,831 And you'll see, probably, a whole\n 16484 15:44:06,701 --> 15:44:09,161 And if we keep scrolling, and\nscrolling, and scrolling-- 16485 15:44:10,631 --> 15:44:12,771 There we go, up to 50 plus already. 16486 15:44:12,771 --> 15:44:15,761 If you need that URL again\nhere, if you're just tuning in 16487 15:44:15,762 --> 15:44:18,192 you can go to this URL here. 16488 15:44:18,192 --> 15:44:21,102 And in just a moment,\nwe'll have a bunch of data 16489 15:44:21,101 --> 15:44:24,792 with which we can start to experiment. 16490 15:44:24,792 --> 15:44:26,412 I'll give you a moment or so there. 16491 15:44:33,760 --> 15:44:35,302 Let me hang in there a little longer. 16492 15:44:35,302 --> 15:44:36,760 OK, we've got over 100 submissions. 16493 15:44:37,732 --> 15:44:40,612 Good, even more coming in now. 16494 15:44:40,612 --> 15:44:42,232 And we can see them coming in live. 16495 15:44:42,232 --> 15:44:44,092 Here, let me switch\nback to the spreadsheet. 
16496 15:44:44,091 --> 15:44:46,431 The list is growing, and\ngrowing, and growing. 16497 15:44:48,241 --> 15:44:51,831 Let me give Carter a moment to\nhelp me export it in real time. 16498 15:44:51,832 --> 15:44:54,982 Carter, just give me a heads\nup when it's reasonable for me 16499 15:44:58,222 --> 15:45:00,482 All right, and I'll begin\nto do this very slowly. 16500 15:45:00,482 --> 15:45:03,062 So I'm going to go up to the File\n 16501 15:45:03,061 --> 15:45:05,451 Download-- you can download a whole\n 16502 15:45:05,451 --> 15:45:07,401 But more simply, and the one\nwe'll start to play with here 16503 15:45:08,972 --> 15:45:12,351 So CSV files we used this past\nweek, why are they useful? 16504 15:45:12,351 --> 15:45:15,531 Now that you've played with them\n 16505 15:45:15,531 --> 15:45:20,601 what's the utility of a CSV file versus\n 16506 15:45:26,199 --> 15:45:28,072 AUDIENCE: Because it's just a text file? 16507 15:45:28,072 --> 15:45:29,947 DAVID J. MALAN: OK, so\nstorage is compelling. 16508 15:45:29,947 --> 15:45:33,052 A simple text file with ASCII or\n 16509 15:45:36,525 --> 15:45:37,860 DAVID J. MALAN: Yeah, well said. 16510 15:45:37,860 --> 15:45:40,101 It's just a simple text\nformat, but using conventions 16511 15:45:40,101 --> 15:45:43,671 like commas you can represent the\n 16512 15:45:43,671 --> 15:45:45,771 backslash n's invisibly\nat the end of your lines 16513 15:45:45,771 --> 15:45:47,341 you can create the idea of rows. 16514 15:45:47,341 --> 15:45:49,341 So it's a very simple\nway of implementing what 16515 15:45:49,341 --> 15:45:51,951 we might call a flat-file database. 16516 15:45:51,951 --> 15:45:54,201 It's a way of storing\ndata in a flat, that is 16517 15:45:54,201 --> 15:45:57,651 very simple file that's just\npure ASCII or Unicode text.
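To make the flat-file idea concrete, here is a small sketch that turns a few rows into CSV text and prints the raw characters. The rows and timestamps are invented, and the lowercase header names are an assumption about the export.

```python
import csv
import io

# Hypothetical rows; a real export would hold everyone's form responses.
ROWS = [
    ["timestamp", "title", "genres"],
    ["10/25/2021 13:00:01", "The Office", "Comedy"],
    ["10/25/2021 13:00:07", "Friends", "Comedy"],
]


def to_csv_text(rows):
    """Serialize rows as CSV: commas between columns, a newline after each row."""
    buffer = io.StringIO()
    # csv.writer ends rows with "\r\n" by default; "\n" is used here only to
    # keep the printed text easy to read.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    text = to_csv_text(ROWS)
    print(repr(text))  # shows the commas and the "\n" characters explicitly
    print(text, end="")
```

Nothing in the result is specific to one program or operating system: it is plain text that any editor or spreadsheet can open, which is the portability point made next.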
16518 15:45:57,652 --> 15:46:00,682 And more compellingly, I dare\nsay, is that with a CSV file 16519 15:46:02,203 --> 15:46:04,161 Something is portable in\nthe world of computing 16520 15:46:04,161 --> 15:46:07,251 if it means you can use it on a Mac\n 16521 15:46:08,091 --> 15:46:10,822 And portability is nice because if\n 16522 15:46:10,822 --> 15:46:13,101 there'd be a whole bunch of\npeople in this room and online 16523 15:46:13,101 --> 15:46:15,531 who couldn't download it because\n 16524 15:46:16,281 --> 15:46:21,021 Or if they have a Mac, or if it's\n 16525 15:46:21,021 --> 15:46:22,951 a PC user might not be\nable to download it. 16526 15:46:22,951 --> 15:46:25,141 So a CSV is indeed very portable. 16527 15:46:25,141 --> 15:46:28,281 So I'm going to go ahead and\ndownload, quite simply, the CSV 16528 15:46:29,781 --> 15:46:32,301 That's going to put it onto\nmy own Mac's Downloads folder. 16529 15:46:32,302 --> 15:46:36,802 And let me go ahead here, and in just a\n 16530 15:46:36,802 --> 15:46:40,252 Because it actually downloads\nit at a pretty large name. 16531 15:46:40,252 --> 15:46:43,461 And give me just one moment here,\nand you'll see that, indeed 16532 15:46:43,461 --> 15:46:46,551 on my Mac I have a file\ncalled favorites.csv. 16533 15:46:46,552 --> 15:46:48,092 I shortened the name real quick. 16534 15:46:48,091 --> 15:46:54,021 And now what I'm going to do is go\n 16535 15:46:54,021 --> 15:46:55,822 I'm going to open my File Explorer. 16536 15:46:55,822 --> 15:46:59,961 And if I minimize my window here for\n 16537 15:46:59,961 --> 15:47:03,322 is that you can just drag and drop a\n 16538 15:47:03,322 --> 15:47:06,002 And voila, it's going to\nautomatically upload it for you. 16539 15:47:06,002 --> 15:47:08,601 So let me go ahead and full\nscreen here, close my Explorer 16540 15:47:08,601 --> 15:47:10,461 temporarily close my Terminal window. 16541 15:47:10,461 --> 15:47:14,061 And you'll see here a\nCSV file, favorites.csv. 
16542 15:47:14,061 --> 15:47:16,731 And the first row, by\nconvention, has whatever 16543 15:47:16,732 --> 15:47:20,062 the columns were in Google\nSpreadsheets, or Office 365 16544 15:47:20,061 --> 15:47:23,961 in Excel online, timestamp,\ncomma, title, comma, genres. 16545 15:47:23,961 --> 15:47:25,731 Then, we have timestamps,\nwhich indicates 16546 15:47:25,732 --> 15:47:27,064 when people started submitting. 16547 15:47:27,063 --> 15:47:29,271 Looks like a couple of people\nwere super eager to get 16548 15:47:30,771 --> 15:47:34,682 And then, you have the\ntitle next, after a comma. 16549 15:47:34,682 --> 15:47:37,491 But there's kind of a\ncuriosity after that. 16550 15:47:37,491 --> 15:47:40,851 Sometimes I see the genre\nlike comedy, comedy, comedy 16551 15:47:40,851 --> 15:47:45,211 but sometimes it's like crime, comma,\n 16552 15:47:45,891 --> 15:47:47,811 And those things are quoted. 16553 15:47:47,811 --> 15:47:49,521 And yet, I didn't do any quotes. 16554 15:47:49,521 --> 15:47:51,141 You probably didn't type any quotes. 16555 15:47:51,141 --> 15:47:55,521 Where are those quotes\ncoming from in this CSV file? 16556 15:47:55,521 --> 15:47:56,991 Why are they there if we infer? 16557 15:48:00,682 --> 15:48:03,650 DAVID J. MALAN: Yeah, so you\nhave a corner case, if you will. 16558 15:48:03,650 --> 15:48:05,692 Because if you're using\ncommas, as you described 16559 15:48:05,692 --> 15:48:09,622 to separate your data into what\nare effectively columns, well 16560 15:48:09,622 --> 15:48:12,352 you've painted yourself into\na corner if your actual data 16561 15:48:13,919 --> 15:48:16,461 So what Google has done, what\nMicrosoft does, what Apple does 16562 15:48:16,461 --> 15:48:19,671 is, they quote any strings\nof text that themselves 16563 15:48:19,671 --> 15:48:23,902 have commas so that these are\nnow English grammatical commas 16564 15:48:26,072 --> 15:48:28,741 So it's a way of escaping\nyour data, if you will. 
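The quoting behavior can be seen with Python's own csv module, which applies the same convention by default. This is a sketch with an invented row; the show title and the genre value "Crime, Drama" are hypothetical and only there to put a comma inside a field.

```python
import csv
import io

# Hypothetical response whose genres field itself contains a comma.
ROW = ["10/25/2021 13:00:15", "Breaking Bad", "Crime, Drama"]


def write_row(row):
    """Return one CSV line for row; fields containing commas get quoted."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(row)
    return buffer.getvalue()


def naive_split(line):
    """Split on every comma, ignoring quotes (this is the mistake to avoid)."""
    return line.rstrip("\n").split(",")


def parse_row(line):
    """Parse one CSV line with csv.reader, which honors the quotes."""
    return next(csv.reader(io.StringIO(line)))


if __name__ == "__main__":
    line = write_row(ROW)
    print(line, end="")        # the third field is wrapped in double quotes
    print(naive_split(line))   # four pieces: the genres field is torn in two
    print(parse_row(line))     # three fields, same as ROW
```

The quotes are why it is usually safer to read CSV files with a CSV library than to split each line on commas by hand.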
16565 15:48:28,741 --> 15:48:31,411 And escaping just means to call\nout a symbol in a special way 16566 15:48:31,411 --> 15:48:33,978 so it's not misinterpreted\nas something else. 16567 15:48:33,978 --> 15:48:35,811 All right, so this is\nall to say that we now 16568 15:48:35,811 --> 15:48:39,322 have all of this data with which we\n 16569 15:48:39,322 --> 15:48:41,182 start calling a flat-file database. 16570 15:48:41,182 --> 15:48:44,572 So suppose I wanted to now\nstart manipulating this data 16571 15:48:44,572 --> 15:48:47,451 and I want to store it ultimately,\nindeed, in this CSV format. 16572 15:48:47,451 --> 15:48:49,881 How can I actually\nstart to read this data 16573 15:48:49,881 --> 15:48:52,292 maybe clean it up, maybe\ndo some analytics on it 16574 15:48:52,292 --> 15:48:55,912 and actually figure out, what's the most\n 16575 15:48:55,911 --> 15:48:57,531 here over the past few minutes? 16576 15:48:57,531 --> 15:48:59,612 Well, let me go ahead and close this. 16577 15:48:59,612 --> 15:49:04,311 Let me go ahead, then, and open up,\n 16578 15:49:04,311 --> 15:49:07,311 And let's code up a file\ncalled favorites.py. 16579 15:49:07,311 --> 15:49:11,451 And let's go ahead and iteratively start\n 16580 15:49:11,451 --> 15:49:13,171 and printing out what's inside of it. 16581 15:49:13,171 --> 15:49:16,671 So you might recall that we can do\n 16582 15:49:16,671 --> 15:49:20,332 to give myself some CSV\nreading functionality. 16583 15:49:20,332 --> 15:49:24,952 Then, I can go ahead and do something\n 16584 15:49:24,951 --> 15:49:27,621 that I want to open in read mode. 16585 15:49:27,622 --> 15:49:29,332 Quote, unquote, "r" means to read it. 
16586 15:49:29,332 --> 15:49:31,942 And then, I can say as\nfile, or whatever other name 16587 15:49:31,942 --> 15:49:35,012 for a variable to say that\nI want to open this file 16588 15:49:35,012 --> 15:49:37,822 and essentially store some kind of\n 16589 15:49:38,942 --> 15:49:42,262 Then, I can give myself a\nreader, and I can say csv.reader 16590 15:49:42,262 --> 15:49:43,739 passing in that file as input. 16591 15:49:43,739 --> 15:49:45,322 And this is the magic of that library. 16592 15:49:45,322 --> 15:49:48,531 It deals with the process of opening\n 16593 15:49:48,531 --> 15:49:51,771 back something that you can just\n 16594 15:49:51,771 --> 15:49:55,851 I do want to skip the first row,\nand recall that I can do this. 16595 15:49:55,851 --> 15:49:59,006 Next, reader, is this little trick\n 16596 15:49:59,006 --> 15:50:00,381 Because the first one is special. 16597 15:50:00,381 --> 15:50:02,752 It said timestamp, title, genres. 16598 15:50:02,752 --> 15:50:04,741 That's not your data, that was mine. 16599 15:50:04,741 --> 15:50:07,222 But this means now that\nI've skipped that first row. 16600 15:50:07,222 --> 15:50:10,042 Everything hereafter is going\nto be the title of a show 16601 15:50:10,042 --> 15:50:11,601 that you all like, so let me do this. 16602 15:50:11,601 --> 15:50:16,432 For row in the reader, let's go\nahead and print out the title 16603 15:50:16,432 --> 15:50:18,201 of the show each of you typed in. 16604 15:50:18,201 --> 15:50:22,341 How do I get at the title of\nthe show each of you typed in? 16605 15:50:22,341 --> 15:50:24,081 It's somewhere inside of row. 16606 15:50:26,131 --> 15:50:28,252 So what do I want to\ntype next in order to get 16607 15:50:28,252 --> 15:50:34,262 at the title of the current\nrow just as a quick check here? 16608 15:50:34,262 --> 15:50:36,692 What do I want to type to\nget at the title of the row 16609 15:50:36,692 --> 15:50:40,862 keeping in mind, again, that it\nwas timestamp, title, genres? 
16610 15:50:42,237 --> 15:50:44,252 DAVID J. MALAN: So row\nbracket 1 would give me 16611 15:50:44,252 --> 15:50:47,982 the second column, 0 index, that is,\n 16612 15:50:47,982 --> 15:50:49,862 So this program isn't\nthat interesting yet 16613 15:50:49,862 --> 15:50:52,711 but it's a quick and dirty way to\n 16614 15:50:53,161 --> 15:50:55,328 Let me actually just do a\nlittle bit of a check here 16615 15:50:55,328 --> 15:50:57,721 and see if it contains\nthe data I think it does. 16616 15:50:57,722 --> 15:50:59,832 Let me maximize my Terminal window here. 16617 15:50:59,832 --> 15:51:03,152 Let me run Python of\nfavorites.py, hitting Enter. 16618 15:51:03,152 --> 15:51:07,862 And you'll see now a purely\ntextual list of all of the shows 16619 15:51:09,902 --> 15:51:12,421 But what's noteworthy about it? 16620 15:51:12,421 --> 15:51:15,301 Specific shows aside,\njudgment aside as to people's 16621 15:51:15,302 --> 15:51:19,982 TV tastes, what's interesting or\nnoteworthy about the data that 16622 15:51:19,982 --> 15:51:23,432 might create some problems for us\n 16623 15:51:23,432 --> 15:51:25,082 and figure out what's the most popular? 16624 15:51:25,082 --> 15:51:27,932 How many people like this or that? 16625 15:51:29,493 --> 15:51:32,800 AUDIENCE: User errors [INAUDIBLE]. 16626 15:51:32,800 --> 15:51:34,842 DAVID J. MALAN: Yeah,\nthere might be user errors 16627 15:51:34,841 --> 15:51:38,141 or just stylistic differences that\n 16628 15:51:42,311 --> 15:51:45,591 Let's see if I can see an\nexample on the screen here. 16629 15:51:45,591 --> 15:51:49,681 Yeah, so friends here is an all\n 16630 15:51:50,482 --> 15:51:51,712 We can sort of mitigate that. 16631 15:51:51,711 --> 15:51:54,822 But this is just a tiny example\nof where data in the real world 16632 15:51:56,099 --> 15:51:57,641 And that probably wasn't even a typo. 
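The csv.reader steps described above can be sketched roughly as follows. This is a minimal illustration, not the lecture's exact file: the sample rows are invented, an in-memory string stands in for the real CSV, and the file name in the comment is an assumption, since the transcript only gives the column order (timestamp, title, genres).

```python
import csv
import io

# Hypothetical stand-in for the exported CSV; only the header order
# (timestamp, title, genres) comes from the transcript.
SAMPLE_CSV = """timestamp,title,genres
2020-10-19 13:00:00,How i met your mother,Comedy
2020-10-19 13:01:00,The Office,Comedy
2020-10-19 13:02:00,friends,Comedy
"""

shown = []
# With a real file this might be: with open("favorites.csv", "r") as file:
# (the file name is an assumption)
with io.StringIO(SAMPLE_CSV) as file:
    reader = csv.reader(file)
    next(reader)  # skip the header row
    for row in reader:
        # Each row is a list of cells; index 1 is the second column (title).
        print(row[1])
        shown.append(row[1])
```

Note that the titles come back exactly as typed, inconsistent capitalization included.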
16633 15:51:57,641 --> 15:52:02,928 It was just someone not caring as much\n 16634 15:52:02,928 --> 15:52:05,261 Your users are going to type\nwhat they're going to type. 16635 15:52:05,262 --> 15:52:08,922 So let's see if we can't now begin\nto get at more specific data 16636 15:52:08,921 --> 15:52:10,841 and maybe even clean\nsome of this data up. 16637 15:52:10,841 --> 15:52:15,911 Let me go back into my file\ncalled favorites.py here 16638 15:52:15,911 --> 15:52:19,691 and let's actually do something a\n 16639 15:52:19,692 --> 15:52:23,412 Instead of a reader, recall that there\n 16640 15:52:23,411 --> 15:52:25,091 just a little more user friendly. 16641 15:52:25,091 --> 15:52:30,011 And it means I can type in dictionary\n 16642 15:52:30,012 --> 15:52:36,012 But now, when I iterate over this\n 16643 15:52:36,012 --> 15:52:39,402 When using a DictReader instead\nof a reader, recall, and this 16644 15:52:39,402 --> 15:52:43,031 is just a peculiarity\nof the CSV library 16645 15:52:43,031 --> 15:52:47,871 this gives me back, not a list\nof cells, but what instead 16646 15:52:47,872 --> 15:52:50,292 which is marginally more\nuser friendly for me? 16647 15:52:53,802 --> 15:52:56,202 I can now use open bracket,\nquotes, and the title. 16648 15:52:56,201 --> 15:52:59,112 Because what's coming back\nnow is a dict object, that is 16649 15:52:59,112 --> 15:53:02,351 a dictionary which has keys and values. 16650 15:53:02,351 --> 15:53:04,402 The keys of which are\nthe column headings. 16651 15:53:04,402 --> 15:53:06,741 The values of which are the\ndata I actually care about. 16652 15:53:06,741 --> 15:53:09,371 So this is just marginally\nbetter because, one, it's 16653 15:53:09,372 --> 15:53:12,672 just way more obvious to me, the\n 16654 15:53:13,391 --> 15:53:15,652 I don't remember what\ncolumn the title was. 16655 15:53:17,232 --> 15:53:18,792 That's something you're\ngoing to forget over time. 
16656 15:53:18,792 --> 15:53:21,851 And God forbid someone changes the\n 16657 15:53:21,851 --> 15:53:24,731 the columns in Excel, or Apple\nNumbers, or Google Spreadsheets. 16658 15:53:24,732 --> 15:53:27,320 That's going to break all\nof your numeric indices. 16659 15:53:27,319 --> 15:53:29,112 And so a dictionary\nreader is arguably just 16660 15:53:29,112 --> 15:53:32,652 better design because it's\nmore robust against changes 16661 15:53:32,652 --> 15:53:34,302 and potential errors like that. 16662 15:53:34,302 --> 15:53:37,902 Now the effect of this change isn't\n 16663 15:53:37,902 --> 15:53:42,162 If I run Python of favorites.py,\n 16664 15:53:42,161 --> 15:53:46,631 But I've now not made any assumptions\n 16665 15:53:48,762 --> 15:53:51,672 Well, let's go ahead and now\nfilter out some duplicates. 16666 15:53:51,671 --> 15:53:55,301 Because there's a lot of commonality\n 16667 15:53:55,302 --> 15:53:58,002 see if we can't filter out duplicates. 16668 15:53:58,002 --> 15:54:04,241 If I'm reading a CSV file top to bottom,\n 16669 15:54:04,241 --> 15:54:06,991 I want to implement to\nfilter out duplicates? 16670 15:54:06,991 --> 15:54:10,241 It's not going to be quite as simple as\n 16671 15:54:10,241 --> 15:54:12,641 I'm going to have to build this. 16672 15:54:12,641 --> 15:54:15,972 But logically, if you're reading\na file from top to bottom 16673 15:54:15,972 --> 15:54:20,202 how might you go about, in\nPython or just any context 16674 15:54:20,201 --> 15:54:23,381 getting rid of duplicate values? 16675 15:54:31,951 --> 15:54:35,072 I could use a list and I could\nadd each title to the list 16676 15:54:35,072 --> 15:54:38,411 but first check if I put\nthis into the list before. 16677 15:54:38,411 --> 15:54:40,481 So let's try a little\nsomething like that. 
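A sketch of the DictReader variant, under the same assumptions (invented rows, in-memory data instead of the real file). The columns are deliberately reordered here to illustrate the robustness argument: looking up `row["title"]` keeps working even though the title is no longer the second column.

```python
import csv
import io

# Hypothetical data with the columns moved around; a numeric index such as
# row[1] would now return the genre instead of the title.
SAMPLE_CSV = """title,genres,timestamp
How i met your mother,Comedy,2020-10-19 13:00:00
The Office,Comedy,2020-10-19 13:01:00
friends,Comedy,2020-10-19 13:02:00
"""

shown = []
with io.StringIO(SAMPLE_CSV) as file:
    reader = csv.DictReader(file)  # the first row supplies the keys
    for row in reader:
        # Each row is a dict keyed by column heading.
        print(row["title"])
        shown.append(row["title"])
```

No call to `next(reader)` is needed in this version, because DictReader consumes the header row itself.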
16678 15:54:40,482 --> 15:54:43,682 Let me go ahead and create a variable\n 16679 15:54:43,682 --> 15:54:46,682 I'll call it titles, for instance,\ninitialize to an empty list 16680 15:54:46,682 --> 15:54:48,391 open bracket, close bracket. 16681 15:54:48,391 --> 15:54:53,201 And then, inside of my loop\nhere, instead of printing it out 16682 15:54:53,201 --> 15:54:54,971 let's start to make a decision. 16683 15:54:54,972 --> 15:55:04,673 So if the current row's\ntitle is in the titles list 16684 15:55:04,673 --> 15:55:05,881 I don't want to put it there. 16685 15:55:05,881 --> 15:55:08,923 And actually, let me invert the logic\n 16686 15:55:08,923 --> 15:55:13,531 So if it's not the case that\nrow bracket title is in titles 16687 15:55:13,531 --> 15:55:21,542 then, go ahead and do something like\n 16688 15:55:21,542 --> 15:55:24,512 And recall that we saw\n.append a week or so ago 16689 15:55:24,512 --> 15:55:27,042 where it just allows you to\nappend to the current list. 16690 15:55:27,042 --> 15:55:30,152 And then, what can I do at\nthe very end, after I'm all 16691 15:55:30,152 --> 15:55:31,802 done reading the whole file? 16692 15:55:31,802 --> 15:55:35,162 Why don't I go ahead and\nsay, for title in titles 16693 15:55:35,161 --> 15:55:38,072 go ahead and print\nout the current title? 16694 15:55:38,072 --> 15:55:42,042 So it's two loops now, and we can come\n 16695 15:55:42,042 --> 15:55:44,715 But let me go ahead here and\nrerun Python of favorites.py. 16696 15:55:44,714 --> 15:55:47,881 Let me increase the size of my Terminal\n 16697 15:55:53,461 --> 15:55:56,432 I don't think I'm seeing\nduplicates, although I 16698 15:55:56,432 --> 15:55:59,342 am seeing some near duplicates. 16699 15:55:59,341 --> 15:56:02,101 For instance, there's Friends again. 
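The list-based duplicate filter described above might look something like this. The rows are invented for illustration; the point is that exact duplicates disappear while near duplicates such as "Friends" and "friends" survive.

```python
import csv
import io

# Hypothetical responses, including one exact duplicate and one near duplicate.
SAMPLE_CSV = """timestamp,title,genres
t1,The Office,Comedy
t2,Friends,Comedy
t3,The Office,Comedy
t4,friends,Comedy
"""

titles = []
with io.StringIO(SAMPLE_CSV) as file:
    reader = csv.DictReader(file)
    for row in reader:
        # Only remember a title the first time it is seen.
        if row["title"] not in titles:
            titles.append(row["title"])

# Second loop: print what was collected, in order of first appearance.
for title in titles:
    print(title)
```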
16700 15:56:02,101 --> 15:56:05,402 And if we keep going, and\ngoing, and going, and going 16701 15:56:06,601 --> 15:56:12,411 Oh, interesting, so that's curious\n 16702 15:56:12,411 --> 15:56:13,971 and I have this one here, too. 16703 15:56:13,972 --> 15:56:16,461 So how might we clean this up further? 16704 15:56:16,461 --> 15:56:18,735 I like your instincts, and\nit's a step closer to it. 16705 15:56:18,735 --> 15:56:20,902 What are we going to have\nto do to really filter out 16706 15:56:24,487 --> 15:56:29,386 AUDIENCE: You could set\neverything to lower [INAUDIBLE].. 16707 15:56:30,262 --> 15:56:32,012 What are the common\nmistakes to summarize? 16708 15:56:32,012 --> 15:56:34,747 We could ignore the capitalization\naltogether and maybe 16709 15:56:34,747 --> 15:56:37,372 just force everything to lowercase,\nor everything to uppercase. 16710 15:56:37,372 --> 15:56:39,262 Doesn't matter which, but\nlet's just be consistent. 16711 15:56:39,262 --> 15:56:42,137 And for those of you who might have\n 16712 15:56:42,137 --> 15:56:44,992 the spacebar at the beginning of\nyour input or even at the end 16713 15:56:46,521 --> 15:56:49,831 Stripping whitespace is a common\n 16714 15:56:49,832 --> 15:56:53,211 So let me go back into my\ncode here, and let me go ahead 16715 15:56:53,211 --> 15:56:55,882 and tweak the title a little bit. 16716 15:56:55,881 --> 15:56:58,701 Let me say that the current\ntitle inside of this loop 16717 15:56:58,701 --> 15:57:01,641 is not going to be just\nthe current row's title. 16718 15:57:01,641 --> 15:57:05,781 But let me go ahead and strip off,\n 16719 15:57:06,601 --> 15:57:09,601 If you read the documentation for the\n 16720 15:57:09,601 --> 15:57:12,572 It gets rid of whitespace to the\nleft, whitespace to the right. 16721 15:57:12,572 --> 15:57:15,652 And then, if I want to force\neverything to maybe uppercase 16722 15:57:15,652 --> 15:57:18,052 I can just uppercase the entire string. 
16723 15:57:18,052 --> 15:57:21,652 And remember, what's handy about Python\n 16724 15:57:21,652 --> 15:57:24,891 calls together by just\nusing dots again and again. 16725 15:57:24,891 --> 15:57:26,841 And that just takes\nwhatever just happened 16726 15:57:26,841 --> 15:57:29,661 like the whitespace got stripped\noff, then, it additionally 16727 15:57:29,661 --> 15:57:31,621 uppercases the whole thing as well. 16728 15:57:31,622 --> 15:57:36,322 So now, I'm going to just check whether\n 16729 15:57:36,322 --> 15:57:40,101 And if not, I'm going to go\nahead and append that title 16730 15:57:40,101 --> 15:57:42,542 massaged into this different\nformat, if you will. 16731 15:57:42,542 --> 15:57:44,692 So I'm throwing away some information. 16732 15:57:44,692 --> 15:57:49,102 I'm sacrificing all of the\nnuances of your grammar and input 16733 15:57:50,362 --> 15:57:52,822 But at least I'm trying to\ncanonicalize, that is 16734 15:57:52,822 --> 15:57:55,292 standardize what the\ndata actually looks like. 16735 15:57:55,292 --> 15:57:59,152 So let me go ahead and run Python\n 16736 15:57:59,152 --> 15:58:00,754 Oh, and this is just user error. 16737 15:58:00,754 --> 15:58:02,211 Maybe you haven't seen this before. 16738 15:58:02,211 --> 15:58:06,211 This just looks like\na mistake on my part. 16739 15:58:06,211 --> 15:58:08,542 I meant to say not even uppercase. 16740 15:58:09,544 --> 15:58:11,752 The function is called upper,\nnow that I think of it. 16741 15:58:12,171 --> 15:58:14,671 Let's go and increase the size\nof the Terminal window again. 16742 15:58:16,161 --> 15:58:20,871 And now, it's a little more overwhelming\n 16743 15:58:22,641 --> 15:58:28,641 But I don't think I'm seeing\nmultiple Friends, so to speak. 16744 15:58:28,641 --> 15:58:31,796 There's one Friends\nup here and that's it. 16745 15:58:31,796 --> 15:58:33,171 I'm back up at my prompt already. 16746 15:58:33,171 --> 15:58:35,612 So we seem now to be\nfiltering out duplicates.
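A small sketch of the canonicalization step, assuming a plain list of invented raw titles in place of the `row["title"]` values read from the CSV. It chains `strip()` and `upper()`, the two string methods named above; Python strings have no method called `uppercase`, which is why the first attempt in the lecture failed.

```python
# Hypothetical raw inputs standing in for the values of row["title"].
raw_titles = ["The Office", "Friends", "friends", " FRIENDS ", "the office"]

titles = []
for raw in raw_titles:
    # strip() removes leading and trailing whitespace; upper() then
    # uppercases the stripped result.
    title = raw.strip().upper()
    if title not in titles:
        titles.append(title)

for title in titles:
    print(title)
```

The original capitalization is lost, which is the trade-off described above.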
16747 15:58:35,612 --> 15:58:38,923 Now, before we dive in further and\n 16748 15:58:38,923 --> 15:58:40,131 what else could we have done? 16749 15:58:40,131 --> 15:58:42,441 Well, it turns out that\nin Python, too, you often 16750 15:58:42,442 --> 15:58:44,692 do get a lot of functionality\nbuilt into the language. 16751 15:58:44,692 --> 15:58:47,655 And I'm kind of implementing\nmyself the idea of a set. 16752 15:58:47,654 --> 15:58:49,822 If you think back to\nmathematics, a set is typically 16753 15:58:49,822 --> 15:58:53,811 something with a bunch of values\n 16754 15:58:53,811 --> 15:58:56,631 Recall that Python\nalready has this for us. 16755 15:58:56,631 --> 15:58:59,542 And we saw it really briefly\nwhen I whipped up the dictionary 16756 15:58:59,542 --> 15:59:01,531 implementation a couple of weeks back. 16757 15:59:01,531 --> 15:59:06,322 So I could actually define my titles\n 16758 15:59:06,322 --> 15:59:11,421 and this would just modestly allow\n 16759 15:59:11,421 --> 15:59:14,211 that I don't have to bother\nchecking for duplicates anyway. 16760 15:59:14,211 --> 15:59:18,051 I can instead just say\nsomething like, titles.add 16761 15:59:18,052 --> 15:59:21,149 the current title, like this. 16762 15:59:21,148 --> 15:59:24,231 Marginally better design if you know\n 16763 15:59:24,232 --> 15:59:26,095 getting more functionality out of this. 16764 15:59:26,095 --> 15:59:28,012 All right, so let's clean\nthe data up further. 16765 15:59:28,012 --> 15:59:31,492 We've now gone ahead and fixed\nthe problem of case sensitivity.
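The set-based alternative could be sketched like this, again with an invented list of raw titles. Because a set never stores the same value twice, the explicit membership check is no longer needed.

```python
# Hypothetical raw inputs standing in for the values of row["title"].
raw_titles = ["The Office", "Friends", "friends", " FRIENDS ", "the office"]

titles = set()
for raw in raw_titles:
    # add() silently ignores values that are already present.
    titles.add(raw.strip().upper())

for title in titles:
    print(title)
```

One design consequence: a set does not preserve insertion order, so the printed order is not guaranteed to match the order of the input.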
16766 15:59:31,491 --> 15:59:34,281 We threw away whitespace in case\nsomeone had hit the spacebar 16767 15:59:35,332 --> 15:59:39,302 Let's go ahead now and sort these\n 16768 15:59:39,302 --> 15:59:42,622 So instead of just printing out\nthe titles in the same order 16769 15:59:42,622 --> 15:59:47,484 you all inputted them, but filtering\n 16770 15:59:47,483 --> 15:59:49,941 and use another function in\nPython you might not have seen 16771 15:59:49,942 --> 15:59:52,281 which is literally\ncalled sorted, and will 16772 15:59:52,281 --> 15:59:57,241 take care of the process of\nactually sorting titles for you. 16773 15:59:57,241 --> 15:59:59,781 Let me go ahead and increase\nthe font size of my Terminal 16774 15:59:59,781 --> 16:00:01,911 run Python of favorites.py,\nand hit Enter. 16775 16:00:01,911 --> 16:00:05,751 And now you can really see how many of\n 16776 16:00:06,561 --> 16:00:08,932 Now it's a little easier\nto wrap our minds around 16777 16:00:08,932 --> 16:00:11,991 just because it's at least\nsorted alphabetically. 16778 16:00:11,991 --> 16:00:15,561 But now you can really see some of\n 16779 16:00:17,192 --> 16:00:21,742 But a few of you decided to stylize\n 16780 16:00:21,741 --> 16:00:24,862 Brooklyn 99 is a couple\nof different ways here. 16781 16:00:24,862 --> 16:00:28,042 And I think if we keep going we'll see\n 16782 16:00:28,042 --> 16:00:31,802 did not fix by focusing on\nwhitespace and capitalization alone. 16783 16:00:31,802 --> 16:00:35,212 So already here, this is only,\nwhat, 100 plus, 200 rows. 16784 16:00:35,211 --> 16:00:38,137 Already real-world data\nstarts to get messy quickly 16785 16:00:38,137 --> 16:00:40,012 and that might not bode\nwell when we actually 16786 16:00:40,012 --> 16:00:41,992 want to keep around real\ndata from real users. 16787 16:00:41,991 --> 16:00:44,366 You can imagine an actual\nwebsite or a mobile application 16788 16:00:44,366 --> 16:00:47,002 dealing with this kind\nof thing on scale. 
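A brief sketch of the sorting step, assuming invented inputs. `sorted` returns a new alphabetized list, and the example also shows the kind of variant spelling that stripping whitespace and uppercasing cannot reconcile.

```python
# Hypothetical raw inputs; the two Brooklyn spellings are invented examples
# of the stylistic variants mentioned above.
raw_titles = ["The Office", "brooklyn nine-nine", "Friends", "Brooklyn 99", "friends "]

titles = set()
for raw in raw_titles:
    titles.add(raw.strip().upper())

# sorted() accepts any iterable and returns a new list in ascending order.
ordered = sorted(titles)
for title in ordered:
    print(title)
```

The two Brooklyn entries remain separate titles, since they differ by more than case and whitespace.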
16789 16:00:47,002 --> 16:00:48,421 Well, let's go ahead and do this. 16790 16:00:48,421 --> 16:00:51,711 Let's actually figure out the\npopularity of these various shows 16791 16:00:51,711 --> 16:00:57,021 by now iterating over my data, and\n 16792 16:00:58,341 --> 16:01:03,231 We're going to ignore the problems\n 16793 16:01:03,232 --> 16:01:07,732 Sorry, yeah, Avatar,\nwhere there was things 16794 16:01:07,732 --> 16:01:12,412 that were different beyond just\nwhitespace and capitalization. 16795 16:01:12,411 --> 16:01:14,181 But let's go ahead and\nkeep track of, now 16796 16:01:14,182 --> 16:01:17,631 how many of you inputted\neach of these titles. 16797 16:01:18,790 --> 16:01:21,082 I'm still going to take this\napproach of iterating over 16798 16:01:21,082 --> 16:01:23,452 the CSV file from top to bottom. 16799 16:01:23,451 --> 16:01:25,701 We've used a couple of\ndata structures thus far 16800 16:01:25,701 --> 16:01:29,601 a list to keep track of titles,\n 16801 16:01:29,601 --> 16:01:32,691 But what if I now want to keep\naround a little more information? 16802 16:01:32,692 --> 16:01:38,452 For each title, I want to keep around\n 16803 16:01:39,472 --> 16:01:43,222 I'm throwing away the total\nnumber of times I see these shows. 16804 16:01:43,222 --> 16:01:45,862 How could I start to keep that around? 16805 16:01:45,862 --> 16:01:47,186 AUDIENCE: Use a dictionary. 16806 16:01:47,186 --> 16:01:49,311 DAVID J. MALAN: We could\nuse a dictionary, and how? 16807 16:01:51,476 --> 16:01:53,434 DAVID J. MALAN: Perfect,\nreally good instincts. 16808 16:01:53,434 --> 16:01:55,191 Using a dictionary,\ninsofar as it lets us 16809 16:01:55,192 --> 16:01:58,641 store keys and values, that is,\n 16810 16:01:59,192 --> 16:02:01,822 This is why a dictionary\nor hash tables more 16811 16:02:01,822 --> 16:02:04,972 generally are such a useful,\npractical data structure. 
16812 16:02:04,972 --> 16:02:08,461 Because they just let you remember\n 16813 16:02:08,461 --> 16:02:11,211 So if the keys are going\nto be the titles I've seen 16814 16:02:11,211 --> 16:02:15,201 the values could be the number of\n 16815 16:02:15,201 --> 16:02:19,649 And so it's kind of like just\n 16816 16:02:19,650 --> 16:02:22,192 For instance, if I were going\nto do this on a piece of paper 16817 16:02:22,192 --> 16:02:24,682 I might just have two\ncolumns here, where 16818 16:02:24,682 --> 16:02:29,932 maybe this is the title that I've\n 16819 16:02:29,932 --> 16:02:33,592 This is, in effect, a\ndictionary in Python. 16820 16:02:33,591 --> 16:02:36,831 It's two columns, keys on the\nleft, values on the right. 16821 16:02:36,832 --> 16:02:38,961 And this, if I can implement\nin code, will actually 16822 16:02:38,961 --> 16:02:42,921 allow me to store this data, and\n 16823 16:02:42,921 --> 16:02:44,792 to figure out which is the most popular. 16824 16:02:45,722 --> 16:02:49,582 Let me go ahead and change my titles\n 16825 16:02:49,582 --> 16:02:54,502 Let's have it be a dictionary instead,\n 16826 16:02:54,502 --> 16:02:58,942 two curly braces that are empty gives\n 16827 16:03:00,201 --> 16:03:02,781 I think most of my\ncode can stay the same. 16828 16:03:02,781 --> 16:03:06,201 But down here, I don't want\nto just blindly add titles 16829 16:03:07,521 --> 16:03:10,401 I somehow need to keep\ntrack of the count. 16830 16:03:10,402 --> 16:03:14,031 And unfortunately, if I just\ndo this-- let's do titles 16831 16:03:14,031 --> 16:03:18,921 bracket, title, plus equals 1. 16832 16:03:18,921 --> 16:03:21,472 This is a reasonable\nfirst attempt at this. 16833 16:03:22,762 --> 16:03:28,072 If titles is a dictionary and I want\n 16834 16:03:28,072 --> 16:03:30,862 the syntax for that, like before,\nis titles, bracket, and then 16835 16:03:30,862 --> 16:03:34,222 the key you want to use to\nindex into the dictionary. 
16836 16:03:34,222 --> 16:03:37,342 It's not a number in this case,\nit's an actual word, a title. 16837 16:03:37,341 --> 16:03:39,561 And you're just going\nto increment it by one 16838 16:03:39,561 --> 16:03:42,351 and then eventually I'll come\nback and finish my second loop 16839 16:03:42,351 --> 16:03:45,051 and do things in terms of the order. 16840 16:03:45,052 --> 16:03:48,922 But for now, let's just keep\ntrack of the total counts. 16841 16:03:48,921 --> 16:03:51,002 Let me go ahead and\nincrease my Terminal window. 16842 16:03:51,002 --> 16:03:54,932 Let me do Python of\nfavorites.py and hit Enter. 16843 16:03:55,432 --> 16:03:59,482 How I Met Your Mother is\ngiving me a key error. 16844 16:04:04,461 --> 16:04:07,671 And in fact, just to give a\nlittle bit of a breadcrumb here 16845 16:04:09,482 --> 16:04:12,262 Let me open up the CSV\nfile again real quickly. 16846 16:04:12,262 --> 16:04:15,802 And wow, we didn't even get\npast the second row in the file 16847 16:04:15,802 --> 16:04:17,552 or the first show in the file. 16848 16:04:17,552 --> 16:04:20,182 Notice that How I Met Your\nMother, somewhat lowercased 16849 16:04:20,182 --> 16:04:22,904 is the very first show therein. 16850 16:04:22,904 --> 16:04:24,862 What's your instinct for\nwhy this is happening? 16851 16:04:24,862 --> 16:04:27,092 AUDIENCE: You don't\nhave a starting point. 16852 16:04:27,091 --> 16:04:29,008 DAVID J. MALAN: I don't\nhave a starting point. 16853 16:04:30,262 --> 16:04:35,182 I'm blindly indexing into the dictionary\n 16854 16:04:35,182 --> 16:04:37,222 that doesn't yet exist\nin the dictionary. 16855 16:04:37,222 --> 16:04:39,952 And so Python throws\nwhat's called a key error 16856 16:04:39,951 --> 16:04:42,781 because the key you're trying\nto use just doesn't exist yet. 16857 16:04:42,781 --> 16:04:46,341 So logically, how could we fix this?
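The failure described above can be reproduced in isolation. This sketch assumes nothing beyond an empty dictionary and one title; the try/except is added only so the example keeps running and is not part of the code being discussed.

```python
titles = {}
title = "HOW I MET YOUR MOTHER"

caught = False
try:
    # += first has to read titles[title], and that key does not exist yet.
    titles[title] += 1
except KeyError as error:
    caught = True
    print("KeyError:", error)
```

Because the read fails before any assignment happens, the dictionary is still empty afterwards.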
16858 16:04:47,031 --> 16:04:50,451 We got half of the problem solved,\n 16859 16:04:50,451 --> 16:04:52,191 case of nothing being there. 16860 16:04:52,936 --> 16:04:54,271 AUDIENCE: Creating a counter. 16861 16:04:54,271 --> 16:04:54,961 DAVID J. MALAN: Creating a-- 16862 16:04:55,711 --> 16:04:57,582 DAVID J. MALAN: Creating\nthe counter itself. 16863 16:04:57,582 --> 16:04:59,531 So maybe I could do something like this. 16864 16:04:59,531 --> 16:05:03,722 Let me close my Terminal window\nand let me ask a question first. 16865 16:05:03,722 --> 16:05:10,322 If the current title is in the\n 16866 16:05:10,322 --> 16:05:12,781 that's going to give me a\ntrue-false answer it turns out. 16867 16:05:12,781 --> 16:05:17,582 Then, I can safely say, titles,\nbracket, title, plus equals 1. 16868 16:05:17,582 --> 16:05:22,082 And recall, this is just shorthand\n 16869 16:05:25,510 --> 16:05:28,052 That's the same thing as this\nbut it's a little more succinct 16870 16:05:30,152 --> 16:05:34,832 Else, if it's logically not the case\n 16871 16:05:34,832 --> 16:05:38,952 dictionary, then I probably want to\n 16872 16:05:38,951 --> 16:05:40,656 Feel free to just shout it out. 16873 16:05:42,156 --> 16:05:46,661 I just have to put some value there\n 16874 16:05:47,161 --> 16:05:49,711 So now that I've got this\ngoing on, let me go ahead 16875 16:05:49,711 --> 16:05:51,961 and undo my sorting temporarily. 16876 16:05:51,961 --> 16:05:54,991 And now let me go ahead and do this. 16877 16:05:54,991 --> 16:05:58,801 I can, as a quick check, let me\ngo ahead and just run the code 16878 16:05:58,802 --> 16:06:00,391 as is, Python of favorites.py. 16879 16:06:02,372 --> 16:06:05,132 It's printing correctly, no key\nerrors, but it's not sorted. 16880 16:06:05,131 --> 16:06:06,961 And I'm not seeing any of the counts. 16881 16:06:06,961 --> 16:06:09,122 Let me just quickly add\nthe counts, and there's 16882 16:06:09,122 --> 16:06:10,872 a couple of ways I could do this. 
16883 16:06:10,872 --> 16:06:18,242 I could, say, print out the title, and\n 16884 16:06:18,241 --> 16:06:22,561 how about just, comma,\ntitles, bracket, title? 16885 16:06:22,561 --> 16:06:24,362 So I'm going to print\ntwo things at once 16886 16:06:24,362 --> 16:06:26,942 both the current title\nin the dictionary 16887 16:06:26,942 --> 16:06:29,641 and whatever its value\nis by indexing into it. 16888 16:06:29,641 --> 16:06:31,481 Let me increase my Terminal window. 16889 16:06:31,482 --> 16:06:35,762 Let me run Python of\nfavorites.py, Enter, and OK. 16890 16:06:39,421 --> 16:06:42,902 None of you said a whole\nlot of TV shows, it seems. 16891 16:06:42,902 --> 16:06:47,031 What's the logical error here? 16892 16:06:47,031 --> 16:06:50,801 What did I do wrong if I\nlook back at my code here? 16893 16:06:55,832 --> 16:07:00,482 To summarize, I initialized the\n 16894 16:07:00,482 --> 16:07:03,842 but I should have initialized it at\n 16895 16:07:03,841 --> 16:07:05,561 Or I should change my code a bit. 16896 16:07:05,561 --> 16:07:08,222 So for instance, if I go back\nin here, the simplest fix 16897 16:07:08,222 --> 16:07:11,732 is probably to initialize to 1,\n 16898 16:07:11,732 --> 16:07:14,552 obviously, I'm seeing this\ntitle for the very first time. 16899 16:07:14,552 --> 16:07:16,922 Or I could change my logic a little bit. 16900 16:07:16,921 --> 16:07:18,811 I could do something like this instead. 16901 16:07:18,811 --> 16:07:24,182 If the current title is not in titles,\n 16902 16:07:24,182 --> 16:07:28,201 And then I could get rid of\nthe else, and now blindly index 16903 16:07:30,241 --> 16:07:34,441 Because now, on line 11, I\ncan trust that lines 9 and 10 16904 16:07:34,442 --> 16:07:37,382 took care of the initialization\nfor me if need be. 16905 16:07:38,911 --> 16:07:42,491 This one's a little nicer, maybe\nbecause it's one line fewer. 
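A sketch of the counting logic in its second form (initialize to 0 when the key is missing, then always increment), using invented raw titles in place of the CSV rows. The if/else form described earlier, which assigns 1 on first sight and increments otherwise, produces the same counts.

```python
# Hypothetical raw inputs standing in for the values of row["title"].
raw_titles = ["The Office", "friends", " Friends ", "The Office", "Community"]

titles = {}
for raw in raw_titles:
    title = raw.strip().upper()
    # Make sure the key exists before incrementing it.
    if title not in titles:
        titles[title] = 0
    titles[title] += 1

# Print each title alongside its count.
for title in titles:
    print(title, titles[title])
```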
16906 16:07:42,491 --> 16:07:45,811 But I think both approaches are\n 16907 16:07:45,811 --> 16:07:47,761 But the key thing, no\npun intended, is that we 16908 16:07:47,762 --> 16:07:52,442 have to make sure the key exists\n 16909 16:07:59,741 --> 16:08:03,031 So otherwise, everyone would have\n 16910 16:08:03,031 --> 16:08:04,531 how many people said the same thing. 16911 16:08:04,531 --> 16:08:06,578 Now the code is as it should be. 16912 16:08:06,578 --> 16:08:08,911 So let me go ahead and open\nup my Terminal window again. 16913 16:08:08,911 --> 16:08:13,051 Let me run Python of favorites.py,\n 16914 16:08:13,052 --> 16:08:14,492 Some shows weren't that popular. 16915 16:08:14,491 --> 16:08:16,171 There's just 1s and maybe 2s. 16916 16:08:16,171 --> 16:08:21,911 But I bet if we sort these things we\n 16917 16:08:23,232 --> 16:08:29,862 Well, turns out, when dealing\nwith a dictionary like this-- 16918 16:08:29,862 --> 16:08:32,502 let's go ahead and just\nsort the titles themselves. 16919 16:08:32,502 --> 16:08:37,472 So let's reintroduce the sorted function\n 16920 16:08:37,472 --> 16:08:40,077 Let me go ahead now and\nrun Python of favorites.py. 16921 16:08:40,076 --> 16:08:42,451 Now it's just a little easier\nto wrap your mind around it 16922 16:08:42,451 --> 16:08:43,909 because at least it's alphabetical. 16923 16:08:43,910 --> 16:08:47,942 But it's not sorted by\nvalue, it's sorted by key. 16924 16:08:47,942 --> 16:08:51,512 But sure enough, if we scroll\ndown, there's something down here 16925 16:08:51,512 --> 16:08:54,544 for instance, like,\nlet's see, The Office. 16926 16:08:54,544 --> 16:08:56,252 That's definitely\ngoing to be a contender 16927 16:08:56,252 --> 16:08:58,201 for most popular, 15 responses. 16928 16:08:58,201 --> 16:09:01,201 But let's see what's actually\ngoing to bubble up to the top. 
16929 16:09:01,201 --> 16:09:06,211 Unfortunately, the sorted function\n 16930 16:09:09,752 --> 16:09:12,841 But it turns out, in Python,\nif you read the documentation 16931 16:09:12,841 --> 16:09:14,851 for the sorted function,\nyou can actually 16932 16:09:14,851 --> 16:09:19,921 pass in other arguments that\ntell it how to sort things. 16933 16:09:19,921 --> 16:09:22,771 For instance, if I want to\ndo things in reverse order 16934 16:09:22,771 --> 16:09:27,481 I can add a second parameter to\n 16935 16:09:28,652 --> 16:09:30,961 You literally say,\nreverse equals true, so 16936 16:09:30,961 --> 16:09:34,171 that the position of it in the\n 16937 16:09:34,171 --> 16:09:37,103 If I now rerun this after\nincreasing my Terminal window 16938 16:09:37,103 --> 16:09:39,061 you'll see now that it's\nin the opposite order. 16939 16:09:39,061 --> 16:09:41,671 Now adventure and Anne\nwith an E is at the bottom 16940 16:09:41,671 --> 16:09:43,752 of the output instead of the top. 16941 16:09:43,752 --> 16:09:52,186 How can I tell it to sort\nby values instead of by key? 16942 16:09:52,186 --> 16:09:53,561 Well, let's go ahead and do this. 16943 16:09:53,561 --> 16:09:56,281 Let me go ahead and define a function. 16944 16:09:56,281 --> 16:09:58,411 I'm just going to call it\nf to keep things simple. 16945 16:09:58,411 --> 16:10:01,531 And this f function is going\nto take a title as input. 16946 16:10:01,531 --> 16:10:06,481 And given a given title, it's going\n 16947 16:10:06,482 --> 16:10:09,902 So actually, maybe a better name\nfor this would be get value 16948 16:10:09,902 --> 16:10:12,162 and/or we could come up\nwith something else as well. 
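The `reverse` parameter mentioned above can be tried on its own. The counts below are invented; the sketch only illustrates that sorting a dictionary sorts its keys, so `reverse=True` flips the alphabetical order and still ignores the values.

```python
# Hypothetical counts standing in for the dictionary built from the CSV.
titles = {"AVATAR": 2, "COMMUNITY": 3, "FRIENDS": 1, "THE OFFICE": 4}

# Sorting a dict iterates over its keys, so both results are alphabetical.
ascending = sorted(titles)
descending = sorted(titles, reverse=True)

for title in descending:
    print(title, titles[title])
```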
16949 16:10:12,161 --> 16:10:14,641 The purpose of the get\nvalue function, to be clear 16950 16:10:14,641 --> 16:10:19,542 is to take it as input a title and\n 16951 16:10:20,802 --> 16:10:23,102 Well, it turns out that the\nsorted function in Python 16952 16:10:23,101 --> 16:10:27,211 according to its documentation,\nalso takes a key parameter 16953 16:10:27,211 --> 16:10:31,201 where you can pass in, crazy\nenough, the name of a function 16954 16:10:31,201 --> 16:10:36,991 that it will use in order to determine\n 16955 16:10:36,991 --> 16:10:41,891 or by the value, or in other cases,\n 16956 16:10:41,891 --> 16:10:44,731 So there's a curiosity here,\nthough, that's very deliberate. 16957 16:10:44,732 --> 16:10:46,892 Key is the name of the\nparameter, just like reverse 16958 16:10:46,891 --> 16:10:48,434 was the name of this other parameter. 16959 16:10:48,434 --> 16:10:51,451 The value of it, though,\nis not a function call. 16960 16:10:52,921 --> 16:10:55,862 Notice I am not doing\nthis, no parentheses. 16961 16:10:55,862 --> 16:11:00,741 I'm instead passing in get value,\n 16962 16:11:00,741 --> 16:11:03,241 And this is a feature of Python\nand certain other languages. 16963 16:11:03,241 --> 16:11:06,451 Just like variables, you can\nactually pass whole functions 16964 16:11:06,451 --> 16:11:10,901 around so that they can be called\n 16965 16:11:10,902 --> 16:11:14,222 So what this means is that the\n 16966 16:11:14,222 --> 16:11:16,752 they didn't know what you're\ngoing to want to sort by today. 16967 16:11:16,752 --> 16:11:21,152 But if you provide them with a function\n 16968 16:11:21,152 --> 16:11:23,342 their sorted function\nwill use that function 16969 16:11:23,341 --> 16:11:27,181 to determine, OK, if you don't want to\n 16970 16:11:28,292 --> 16:11:31,141 This is going to tell\nit to sort by the value 16971 16:11:31,141 --> 16:11:34,091 by returning the specific\nvalue we care about. 
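Sorting by value with a key function might look like the following sketch. The counts are invented, and `get_value` is written here as `get_value` on the assumption that this is how the spoken name "get value" was spelled in code.

```python
# Hypothetical counts standing in for the dictionary built from the CSV.
titles = {"AVATAR": 2, "COMMUNITY": 3, "FRIENDS": 1, "THE OFFICE": 4}


def get_value(title):
    """Return the count stored for a title."""
    return titles[title]


# key=get_value passes the function itself (no parentheses); sorted() calls
# it once per title and orders the titles by the returned counts.
by_count = sorted(titles, key=get_value, reverse=True)

for title in by_count:
    print(title, titles[title])
```

Writing `key=get_value()` instead would call the function immediately with no argument and raise a TypeError, which is why the parentheses are left off.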
16972 16:11:34,091 --> 16:11:37,921 So let me go ahead now and rerun this\n 16973 16:11:39,961 --> 16:11:42,451 Here we have now an example\nof all of the titles you all 16974 16:11:42,451 --> 16:11:47,521 typed in, albeit forced to uppercase\n 16975 16:11:47,521 --> 16:11:50,074 And now, The Office is\nan easy win over Friends 16976 16:11:50,074 --> 16:11:52,741 versus Community, versus Game of\nThrones, Breaking Bad, and then 16977 16:11:52,741 --> 16:11:55,081 a lot of variants thereafter. 16978 16:11:55,082 --> 16:11:57,022 So there's a lot of steps to go through. 16979 16:11:57,021 --> 16:11:58,896 This isn't that bad once\nyou've done it once 16980 16:11:58,896 --> 16:12:00,813 and you know what these\nfunctions are, and you 16981 16:12:00,813 --> 16:12:02,211 know that these parameters exist. 16982 16:12:03,351 --> 16:12:07,432 That's 17 lines of code\njust to analyze a CSV file 16983 16:12:07,432 --> 16:12:10,762 that you all created by way of\nthose Google Form submissions. 16984 16:12:10,762 --> 16:12:13,702 But it took me a lot of work just\n 16985 16:12:13,701 --> 16:12:15,618 And indeed, that's going\nto be among the goals 16986 16:12:15,618 --> 16:12:18,261 for today, ultimately, is, how\ncan we just make this easier? 16987 16:12:18,262 --> 16:12:20,137 It's one thing to learn\nnew things in Python 16988 16:12:20,137 --> 16:12:22,641 but if we can avoid writing\ncode, or this much code 16989 16:12:22,641 --> 16:12:24,182 that's going to be a good thing. 16990 16:12:24,182 --> 16:12:26,362 And so one other technique\nwe can introduce here 16991 16:12:26,362 --> 16:12:28,912 that does allow us to\nwrite a little less code 16992 16:12:28,911 --> 16:12:31,072 is, we can actually get\nrid of this function. 
16993 16:12:31,072 --> 16:12:34,582 It turns out, in Python, if you\njust need to make a function 16994 16:12:34,582 --> 16:12:37,281 but it's going to be used and\nthen essentially thrown away 16995 16:12:37,281 --> 16:12:40,131 it's not something you're going\n 16996 16:12:40,131 --> 16:12:42,891 it's not like a library function\nthat you want to keep around-- 16997 16:12:42,891 --> 16:12:45,021 you can actually just do this. 16998 16:12:45,021 --> 16:12:48,771 You can change the value\nof this key parameter 16999 16:12:48,771 --> 16:12:51,291 to be what's called a\nlambda function, which 17000 16:12:51,292 --> 16:12:54,381 is a fancy way of saying a function\n 17001 16:12:57,741 --> 16:13:00,871 Well, it's kind of stupid that\nI invented this name on line 13. 17002 16:13:00,872 --> 16:13:04,012 I used it on line 16, and\nthen I never again used it. 17003 16:13:04,012 --> 16:13:07,862 If there's only being used in one place,\n 17004 16:13:07,862 --> 16:13:10,342 So if you instead, in\nPython, say lambda 17005 16:13:10,341 --> 16:13:13,191 and then type out the\nname of the parameter 17006 16:13:13,192 --> 16:13:15,472 you want this anonymous\nfunction to take 17007 16:13:15,472 --> 16:13:19,802 you can then say, go ahead\nand return this value. 17008 16:13:19,802 --> 16:13:22,372 Now let's notice the\ninconsistencies here. 17009 16:13:22,372 --> 16:13:25,192 When you use this special lambda\nkeyword that says, hey Python 17010 16:13:25,192 --> 16:13:28,192 give me an anonymous function,\na function with no name 17011 16:13:28,192 --> 16:13:31,952 it then says, Python, this anonymous\n 17012 16:13:31,951 --> 16:13:34,040 Notice there's no parentheses. 17013 16:13:34,040 --> 16:13:35,841 And that's deliberate, if confusing. 17014 16:13:35,841 --> 16:13:38,250 It just tightens things up a little bit. 17015 16:13:38,250 --> 16:13:42,111 Notice that there's no return keyword,\n 17016 16:13:42,112 --> 16:13:44,122 up a bit, albeit inconsistently. 
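A sketch of the same sort using a lambda, placed next to the named-function version for comparison. The counts are invented; both calls should produce the same ordering.

```python
# Hypothetical counts standing in for the dictionary built from the CSV.
titles = {"AVATAR": 2, "COMMUNITY": 3, "FRIENDS": 1, "THE OFFICE": 4}


def get_value(title):
    return titles[title]


with_named = sorted(titles, key=get_value, reverse=True)

# lambda <parameter>: <expression>
# No function name, no parentheses around the parameter, no return keyword;
# the value of the expression is what gets returned.
with_lambda = sorted(titles, key=lambda title: titles[title], reverse=True)

for title in with_lambda:
    print(title, titles[title])
```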
17017 16:13:44,122 --> 16:13:47,992 But this line of code\nI've just highlighted 17018 16:13:47,991 --> 16:13:51,771 is actually identical in\nfunctionality to this. 17019 16:13:51,771 --> 16:13:53,662 But it throws away the word [INAUDIBLE]. 17020 16:13:53,661 --> 16:13:55,191 It throws away the word get value. 17021 16:13:55,192 --> 16:13:58,852 It throws away the parentheses, and\n 17022 16:14:00,021 --> 16:14:02,631 And it's well suited\nfor a problem like this 17023 16:14:02,631 --> 16:14:05,301 where I just want to pass in\na tiny little function that 17024 16:14:06,432 --> 16:14:08,182 But it's not something\nI'm going to reuse. 17025 16:14:08,182 --> 16:14:10,442 It doesn't need multiple\nlines to take up space. 17026 16:14:10,442 --> 16:14:12,501 It's just a nice, elegant one liner. 17027 16:14:12,500 --> 16:14:14,181 That's all a lambda function does. 17028 16:14:14,182 --> 16:14:17,251 It allows you to create an anonymous\n 17029 16:14:17,250 --> 16:14:22,551 And then the function you're passing it\n 17030 16:14:22,552 --> 16:14:26,152 Indeed, if I run Python of favorites.py\n 17031 16:14:26,152 --> 16:14:28,141 the result is exactly the same. 17032 16:14:28,141 --> 16:14:31,701 And we see at the bottom here\nall of those small results. 17033 16:14:31,701 --> 16:14:36,151 Are any questions, then, on\nthis syntax, on these ideas? 17034 16:14:36,152 --> 16:14:39,112 The goal here has been to write\n 17035 16:14:39,112 --> 16:14:44,201 to analyze or clean up data like this. 17036 16:14:48,614 --> 16:14:51,781 DAVID J. MALAN: Could you use the lambda\n 17037 16:14:51,781 --> 16:14:54,601 It's really meant for one\nline of code, generally. 17038 16:14:54,601 --> 16:14:56,762 So you don't use the return keyword. 17039 16:14:56,762 --> 16:14:59,222 You just say what it\nis you want to return. 17040 16:15:03,021 --> 16:15:04,271 DAVID J. MALAN: Good question. 
17041 16:15:04,271 --> 16:15:06,301 Could you do more in\nthat one line if it's 17042 16:15:06,302 --> 16:15:08,012 got to be a more involved algorithm? 17043 16:15:08,012 --> 16:15:11,162 Yes, but you would just ultimately\nreturn the value in question. 17044 16:15:11,161 --> 16:15:13,288 In short, if it's getting\nat all sophisticated 17045 16:15:13,288 --> 16:15:15,121 you don't use the lambda\nfunction in Python. 17046 16:15:15,122 --> 16:15:17,852 You go ahead and actually\njust define a name for it 17047 16:15:17,851 --> 16:15:19,574 even if it's a one-off name. 17048 16:15:19,574 --> 16:15:21,991 JavaScript, another language\nwe'll look at in a few weeks 17049 16:15:21,991 --> 16:15:24,932 makes heavier use, I dare\nsay, of lambda functions. 17050 16:15:24,932 --> 16:15:27,372 And those can actually be\nmultiple, multiple lines 17051 16:15:27,372 --> 16:15:30,862 but Python does not\nsupport that instinct. 17052 16:15:31,362 --> 16:15:33,069 So let's go ahead and\ndo one other thing. 17053 16:15:33,069 --> 16:15:35,682 Office was clearly popping out\nof the code here quite a bit. 17054 16:15:35,682 --> 16:15:38,101 Let's go ahead and write a\nslightly different program 17055 16:15:38,101 --> 16:15:40,741 that maybe just focuses on\nThe Office for the moment 17056 16:15:42,241 --> 16:15:46,591 So let me go ahead and throw most of\n 17057 16:15:46,591 --> 16:15:48,421 when I'm inside of my inner loop. 17058 16:15:48,421 --> 16:15:51,391 And let me go ahead, and I don't\n 17059 16:15:51,391 --> 16:15:53,582 All I want to do is focus\non the current title. 17060 16:15:53,582 --> 16:15:56,072 How could I detect if\nsomeone likes The Office? 17061 16:15:56,072 --> 16:15:59,131 Well, I could say something like-- 17062 16:16:01,652 --> 16:16:03,692 We'll just focus on The Office. 17063 16:16:03,692 --> 16:16:09,272 If title equals, equals The Office,\n 17064 16:16:13,741 --> 16:16:15,199 There's no dictionary involved now. 
17065 16:16:15,199 --> 16:16:17,221 It's just a simple integer variable. 17066 16:16:17,222 --> 16:16:21,092 And then, down here\nI'll say something like 17067 16:16:21,091 --> 16:16:26,311 number of people who like The\nOffice is, whatever this value is. 17068 16:16:26,311 --> 16:16:29,191 And I'll put in counter in\ncurly braces, and then I'll 17069 16:16:29,192 --> 16:16:31,125 turn this whole thing into an F string. 17070 16:16:31,125 --> 16:16:32,792 All right, let me go ahead and run this. 17071 16:16:32,792 --> 16:16:35,442 Python of favorites.py, Enter. 17072 16:16:35,442 --> 16:16:37,952 Number of people who\nlike The Office is 15. 17073 16:16:39,332 --> 16:16:42,872 But let's go ahead now and\ndeliberately muddy the data a bit. 17074 16:16:42,872 --> 16:16:46,502 All of you were very nice in\nthat you typed in The Office. 17075 16:16:46,502 --> 16:16:48,572 But you can imagine\nsomeone just typing Office 17076 16:16:48,572 --> 16:16:51,033 for instance, maybe there, maybe there. 17077 16:16:51,033 --> 16:16:53,491 And many people might just\nwrite Office, you could imagine. 17078 16:16:53,491 --> 16:16:55,741 Didn't happen here, but\nsuppose it did, and probably 17079 16:16:55,741 --> 16:16:58,631 would have if we had even more\nand more submissions over time. 17080 16:16:58,631 --> 16:17:02,341 Now let's go ahead and rerun this\n 17081 16:17:02,341 --> 16:17:04,471 Now only 13 people like The Office. 17082 16:17:05,491 --> 16:17:11,131 The data is now as I mutated it to have\n 17083 16:17:11,131 --> 16:17:16,391 How could I change my Python code to\n 17084 16:17:16,391 --> 16:17:20,731 What could I change up here in\norder to improve this situation? 17085 16:17:24,091 --> 16:17:27,641 AUDIENCE: You write\nthe title [INAUDIBLE].. 17086 16:17:27,641 --> 16:17:30,391 DAVID J. 
MALAN: Yeah, so I could\n 17087 16:17:30,391 --> 16:17:34,711 If title equals, equals The Office,\nor title equals, equals just 17088 16:17:35,779 --> 16:17:38,072 And I still don't have to\nworry about capitalization. 17089 16:17:38,072 --> 16:17:41,154 I don't have to worry about spaces\n 17090 16:17:41,154 --> 16:17:43,542 Now I can go ahead and rerun this code. 17091 16:17:43,542 --> 16:17:45,332 Let me go run it a third time. 17092 16:17:50,552 --> 16:17:54,002 You could imagine this\nnot scaling very well. 17093 16:17:54,002 --> 16:17:57,105 Avatar had three different\n 17094 16:17:57,105 --> 16:17:59,522 if we dug deeper that there\nmight have been more variants. 17095 16:17:59,521 --> 16:18:01,771 Could we do something a\nlittle more general purpose? 17096 16:18:01,771 --> 16:18:03,572 Well, we could do something like this. 17097 16:18:07,224 --> 16:18:09,391 this is kind of a cool thing\nyou can do with Python. 17098 16:18:09,391 --> 16:18:12,271 It's very English-like, just ask\nthe question, albeit tersely. 17099 16:18:12,271 --> 16:18:16,002 This, interestingly, just\ngot me into trouble. 17100 16:18:16,002 --> 16:18:18,482 Now, all of a sudden, we're up to 16. 17101 16:18:18,482 --> 16:18:21,594 Does anyone know what the other one is? 17102 16:18:21,593 --> 16:18:23,439 AUDIENCE: Someone put V Office. 17103 16:18:23,440 --> 16:18:24,607 DAVID J. MALAN: What Office? 17104 16:18:24,607 --> 16:18:27,462 AUDIENCE: Someone entered\na V Office, [INAUDIBLE]. 17105 16:18:29,858 --> 16:18:31,191 DAVID J. MALAN: Oh, interesting. 17106 16:18:42,472 --> 16:18:44,811 OK, this one's actually going\nto be hard to correct for. 17107 16:18:44,811 --> 16:18:46,641 I can't really think of a general-- 17108 16:18:46,641 --> 16:18:51,201 well, this is actually a good\nexample of how data gets messy fast.
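To make the three matching strategies above concrete, here is a small sketch. The list of titles is invented for illustration and is assumed to be already stripped and uppercased; the substring test written as `"OFFICE" in title` is my assumption about the "English-like" check, since the exact line is not in the captions.

```python
# Hypothetical cleaned-up titles (already stripped and forced to uppercase).
titles = ["THE OFFICE", "OFFICE", "THE V OFFICE", "FRIENDS", "THE OFFICE"]

exact = 0      # title is exactly "THE OFFICE"
either = 0     # title is "THE OFFICE" or just "OFFICE"
substring = 0  # "OFFICE" appears anywhere in the title

for title in titles:
    if title == "THE OFFICE":
        exact += 1
    if title == "THE OFFICE" or title == "OFFICE":
        either += 1
    if "OFFICE" in title:
        substring += 1

print(f"exact: {exact}, either: {either}, substring: {substring}")
```

The substring test is the most forgiving, which is exactly why it also picks up the unexpected "THE V OFFICE" entry.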
17109 16:18:51,201 --> 16:18:53,421 And you could imagine doing\nsomething where, OK, we 17110 16:18:53,421 --> 16:18:58,261 could have like 26 conditions if someone\n 17111 16:18:58,762 --> 16:18:59,992 You could imagine doing that. 17112 16:18:59,991 --> 16:19:02,741 But then there's surely going to\n 17113 16:19:02,741 --> 16:19:04,951 So that's actually a hard one to fix. 17114 16:19:04,951 --> 16:19:10,072 But it turns out we got lucky and now\n 17115 16:19:10,072 --> 16:19:12,002 But the data is itself messy. 17116 16:19:12,002 --> 16:19:15,292 Let me show another way that just\n 17117 16:19:15,292 --> 16:19:20,092 It turns out that there's this feature\n 17118 16:19:20,091 --> 16:19:22,381 among them, called regular expressions. 17119 16:19:22,381 --> 16:19:24,381 And this is actually a\nreally powerful technique 17120 16:19:24,381 --> 16:19:26,214 that we'll just scratch\nthe surface of here. 17121 16:19:26,214 --> 16:19:29,631 But it's going to be really useful,\n 17122 16:19:29,631 --> 16:19:34,221 in web programming, any time you want\n 17123 16:19:34,222 --> 16:19:37,252 And actually, just to make\nthis clear, give me a moment 17124 16:19:37,252 --> 16:19:39,502 before I switch screens here. 17125 16:19:39,502 --> 16:19:43,792 And let me open up a\nGoogle Form from scratch. 17126 16:19:43,792 --> 16:19:47,572 Give me just a moment to\ncreate something real quick. 17127 16:19:47,572 --> 16:19:50,601 If you've never noticed this\nbefore when creating a Google Form 17128 16:19:53,752 --> 16:19:55,701 And if you want the user\nto type in something 17129 16:19:55,701 --> 16:19:58,375 very specific as a short\ntext answer like this 17130 16:19:58,375 --> 16:20:01,042 you might know that there's toggles\nlike this in Google's world 17131 16:20:02,271 --> 16:20:04,611 Or you can do response validation. 17132 16:20:04,612 --> 16:20:07,012 You could say, what's your email? 
17133 16:20:07,012 --> 16:20:12,592 And then you could say something\nlike, text is an email. 17134 16:20:12,591 --> 16:20:17,871 So here's an example in Google Forms\n 17135 16:20:17,872 --> 16:20:22,492 But a feature most of you have probably\n 17136 16:20:22,491 --> 16:20:24,831 is this thing called a\nregular expression, where 17137 16:20:24,832 --> 16:20:26,781 you can actually define a pattern. 17138 16:20:26,781 --> 16:20:30,171 And I could actually reimplement that\n 17139 16:20:30,171 --> 16:20:36,411 I can say, let the user type in anything\n 17140 16:20:36,411 --> 16:20:41,941 then something else, then a\nliteral period, then, for instance 17141 16:20:43,021 --> 16:20:45,291 So it's very cryptic,\nadmittedly, at first glance. 17142 16:20:45,292 --> 16:20:48,772 But this means any\ncharacter 0 or more times. 17143 16:20:48,771 --> 16:20:51,502 This means any character 0 or more times. 17144 16:20:51,502 --> 16:20:54,067 This means a literal\nperiod, because apparently 17145 16:20:54,067 --> 16:20:57,502 dot means any character in\nthe context of these patterns. 17146 16:20:57,502 --> 16:21:01,412 Then this thing means any\ncharacter 0 or more times. 17147 16:21:01,411 --> 16:21:04,011 So I should actually be\na little more nitpicky. 17148 16:21:04,012 --> 16:21:06,872 You don't want 0 or more times,\nyou want 1 or more times. 17149 16:21:06,872 --> 16:21:10,682 So this with the plus means\nany character 1 or more times. 17150 16:21:10,682 --> 16:21:12,362 So there has to be something there. 17151 16:21:12,362 --> 16:21:16,972 And I think I want the same thing\n 17152 16:21:16,972 --> 16:21:21,502 Or heck, if I want to restrict this\n 17153 16:21:21,502 --> 16:21:24,862 I could change that last\nthing to literally .edu.
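A pattern along those lines can be tried out in Python. This is only a sketch: the exact pattern typed into the form is not fully visible in the captions, so `.+@.+\.edu` is an assumed reconstruction, the sample addresses are made up, and requiring the whole string to match (via `re.fullmatch`) is my choice rather than a statement about how Google Forms behaves.

```python
import re

# Assumed pattern: one or more of any character, an at sign, one or more
# of any character, then a literal ".edu". The backslash makes the dot
# literal instead of "any character".
PATTERN = r".+@.+\.edu"


def looks_like_edu_email(text):
    # fullmatch succeeds only if the entire string fits the pattern.
    return re.fullmatch(PATTERN, text) is not None


for candidate in ["student@example.edu", "student@example", "@.edu", "someone@example.com"]:
    print(candidate, looks_like_edu_email(candidate))
```

Swapping each `+` for `*` would accept "@.edu" as well, which is the nitpick about 0 versus 1 or more characters.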
17154 16:21:24,862 --> 16:21:26,912 And so long story short,\neven though this looks 17155 16:21:26,911 --> 16:21:31,761 I'm sure, pretty cryptic, there's\n 17156 16:21:31,762 --> 16:21:35,242 and JavaScript, and Java, and other\n 17157 16:21:35,241 --> 16:21:37,771 patterns in a standardized way. 17158 16:21:37,771 --> 16:21:41,271 And this pattern is actually something\n 17159 16:21:41,271 --> 16:21:43,491 And let me switch back to\nPython for a second just 17160 16:21:43,491 --> 16:21:45,261 to do the same kind of idea. 17161 16:21:45,262 --> 16:21:48,292 Let me toggle back to my code here. 17162 16:21:48,292 --> 16:21:52,072 Let me put up, for instance, a\nsummary of what it is you can do. 17163 16:21:52,072 --> 16:21:58,372 And here's just a quick summary\n 17164 16:21:58,372 --> 16:22:04,672 A period may represent any character.\n 17165 16:22:05,311 --> 16:22:08,362 So the dot means anything,\nso it can be A or nothing. 17166 16:22:09,502 --> 16:22:14,872 It can be A, B, A, B, C. It can be any\n 17167 16:22:14,872 --> 16:22:18,202 Change that to a plus and you now\n 17168 16:22:18,201 --> 16:22:21,201 Question mark means\nsomething is optional. 17169 16:22:21,201 --> 16:22:24,711 Caret symbol means start matching at\n 17170 16:22:24,711 --> 16:22:30,442 Dollar sign means stop matching\nat the end of the user's input. 17171 16:22:30,442 --> 16:22:32,552 So we won't play with\nall of these just now. 17172 16:22:32,552 --> 16:22:36,812 But let me go over here and\nactually tackle this Office problem. 17173 16:22:36,811 --> 16:22:40,792 Let me go ahead and import a new library\n 17174 16:22:42,622 --> 16:22:45,902 And then, down here, let me say this. 17175 16:22:50,421 --> 16:22:55,551 Let's just search for Office, quote,\n 17176 16:22:55,552 --> 16:22:58,072 Then we're going to go ahead\nand increase the counter. 
17177 16:22:58,072 --> 16:23:00,381 So it turns out that the\nregular expression library 17178 16:23:00,381 --> 16:23:04,432 has a function called search that\n 17179 16:23:04,432 --> 16:23:07,311 and then, as its second\nargument the string you 17180 16:23:07,311 --> 16:23:09,502 want to analyze for that pattern. 17181 16:23:09,502 --> 16:23:13,222 So it's sort of looking for a needle\n 17182 16:23:13,222 --> 16:23:17,421 Let me go ahead now and run this\nversion of the program, Enter. 17183 16:23:17,421 --> 16:23:21,591 And now I screwed up because I forgot\n 17184 16:23:24,491 --> 16:23:27,141 Number of people who\nlike The Office is now 0. 17185 16:23:28,311 --> 16:23:30,981 thank you-- big step backwards. 17186 16:23:36,951 --> 16:23:39,936 I forced all my input to uppercase,\n 17187 16:23:39,936 --> 16:23:41,811 So we'll come back to\nother approaches there. 17188 16:23:42,771 --> 16:23:45,141 OK, now we're back up to 16. 17189 16:23:45,141 --> 16:23:47,542 But I could even, let's say-- 17190 16:23:47,542 --> 16:23:50,452 I could tolerate just The Office. 17191 16:23:50,451 --> 16:23:55,461 How about this, or how about\nsomething like, or The Office? 17192 16:23:57,714 --> 16:23:59,631 And let me use these\nother special characters. 17193 16:23:59,631 --> 16:24:02,721 This caret sign means the\nbeginning of the string. 17194 16:24:02,722 --> 16:24:06,082 This dollar sign weirdly\nrepresents the end of the string. 17195 16:24:06,082 --> 16:24:09,832 I'm adding in some parentheses just\n 17196 16:24:11,811 --> 16:24:15,921 And this is saying start matching\n 17197 16:24:15,921 --> 16:24:20,002 Check if the beginning of the string is\n 17198 16:24:21,262 --> 16:24:23,882 And then, you better be\nat the end of the string. 17199 16:24:23,881 --> 16:24:26,991 So they can't keep typing words\nbefore or after that input. 17200 16:24:26,991 --> 16:24:29,031 Let me go ahead and rerun the program. 
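The difference between the loose search and the anchored one can be sketched as follows. The titles are invented (including "OFFICE SPACE", which is there only to show over-matching), and the anchored pattern is my reading of the one described, not a transcription of the code on screen.

```python
import re

# Hypothetical titles, already forced to uppercase.
titles = ["THE OFFICE", "OFFICE", "THE V OFFICE", "OFFICE SPACE"]

# Loose: "OFFICE" may appear anywhere in the string.
loose = [t for t in titles if re.search("OFFICE", t)]

# Strict: ^ pins the match to the start, $ to the end, and the
# parentheses with | allow either "OFFICE" or "THE OFFICE" in between.
strict = [t for t in titles if re.search("^(OFFICE|THE OFFICE)$", t)]

print(loose)
print(strict)
```

`re.search` takes the pattern first and the string to examine second, which matches the needle-and-haystack description above.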
17201 16:24:29,031 --> 16:24:32,841 And now we're down to 15, which\nused to be our correct answer 17202 16:24:32,841 --> 16:24:36,111 but then we noticed The V Office. 17203 16:24:38,021 --> 16:24:41,541 It's going to be messier\nto deal with that. 17204 16:24:41,542 --> 16:24:46,897 How about if I tolerate any\ncharacter represented by dot 17205 16:24:48,982 --> 16:24:53,452 Now if I rerun it, now I really\nhave this expressive capability. 17206 16:24:53,451 --> 16:24:57,682 So this is only to say, there are so\n 17207 16:24:58,732 --> 16:25:01,292 And some of these tools are\nmore sophisticated than others. 17208 16:25:01,292 --> 16:25:04,298 This is one that you've actually\n 17209 16:25:04,298 --> 16:25:06,381 in the context of Google\nForms for years if you're 17210 16:25:06,381 --> 16:25:09,112 in the habit of creating these for\n 17211 16:25:09,112 --> 16:25:11,182 But it's now something\nyou can start to leverage. 17212 16:25:11,182 --> 16:25:14,781 And we're just scratching the surface\n 17213 16:25:14,781 --> 16:25:18,981 But let's now do one final example\n 17214 16:25:18,982 --> 16:25:20,872 And let's actually\nwrite a program that's 17215 16:25:20,872 --> 16:25:25,382 a little more general purpose that\n 17216 16:25:25,381 --> 16:25:27,112 and figure out its popularity. 17217 16:25:27,112 --> 16:25:29,752 So let me go ahead and simplify this. 17218 16:25:29,752 --> 16:25:31,972 Let's get rid of our\nregular expressions. 17219 16:25:31,972 --> 16:25:35,281 Let's go ahead and continue\ncapitalizing the title. 17220 16:25:36,921 --> 16:25:41,362 at the beginning of this program,\n 17221 16:25:42,722 --> 16:25:45,662 So title equals, let's\nask the user for input 17222 16:25:45,661 --> 16:25:48,831 which is essentially the same thing\n 17223 16:25:50,302 --> 16:25:53,512 And then whatever they type in,\n 17224 16:25:53,512 --> 16:25:56,242 and uppercase the thing again. 
17225 16:25:56,241 --> 16:26:01,161 And now, inside of my loop, I\ncould say something like this. 17226 16:26:01,161 --> 16:26:08,001 If the current row's title after\n 17227 16:26:08,002 --> 16:26:12,262 it to uppercase, too, equals\nthe user's title, then, go ahead 17228 16:26:12,262 --> 16:26:14,781 and maybe increment a counter. 17229 16:26:14,781 --> 16:26:16,502 So I still need that counter back. 17230 16:26:16,502 --> 16:26:21,951 So let me go ahead and define this\n 17231 16:26:21,951 --> 16:26:24,061 And then, at the very\nend of this program 17232 16:26:24,061 --> 16:26:26,391 let me go ahead and print\nout just the popularity 17233 16:26:26,391 --> 16:26:28,381 of whatever the human typed in. 17234 16:26:28,381 --> 16:26:31,371 So again, the only difference is\n 17235 16:26:32,061 --> 16:26:34,491 I'm initializing my\ncounter to 0, then I'm 17236 16:26:34,491 --> 16:26:38,002 searching for their\ntitle in the CSV file 17237 16:26:38,002 --> 16:26:41,152 by doing the same massaging of the\n 17238 16:26:41,152 --> 16:26:43,912 and getting rid of the whitespace. 17239 16:26:43,911 --> 16:26:47,121 So now, when I run Python\nof favorites.py, Enter 17240 16:26:47,122 --> 16:26:55,372 I could type in the office all lowercase\n 17241 16:27:02,042 --> 16:27:05,982 Because I'm the one that went in and\n 17242 16:27:05,982 --> 16:27:08,372 If we fixed those, we\nwould be back up to 15. 17243 16:27:08,372 --> 16:27:12,992 If we added support for The V\n 17244 16:27:12,991 --> 16:27:15,691 All right, any questions then\non these various manipulations? 17245 16:27:15,692 --> 16:27:17,525 And if you're feeling\nlike, oh, my god, this 17246 16:27:17,525 --> 16:27:20,442 is so much Python code just to do\n 17247 16:27:20,442 --> 16:27:22,502 And indeed, even though\nit's a powerful language 17248 16:27:22,502 --> 16:27:26,012 and can solve these kinds of problems,\n 17249 16:27:26,012 --> 16:27:28,812 just to ask a single question like this. 
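The program walked through above might look roughly like this. It is a sketch under a few assumptions: the CSV text is a tiny made-up stand-in for favorites.csv, the function name `popularity` is hypothetical, and the title is passed in as an argument so the example runs without waiting for keyboard input.

```python
import csv
import io

# Hypothetical stand-in for favorites.csv (same three column names).
SAMPLE = """Timestamp,title,genres
t1,The Office,Comedy
t2, the office ,Comedy
t3,Office,Comedy
t4,Friends,Comedy
"""


def popularity(csv_file, wanted):
    # Clean up the requested title the same way each row is cleaned up.
    wanted = wanted.strip().upper()
    counter = 0
    for row in csv.DictReader(csv_file):
        if row["title"].strip().upper() == wanted:
            counter += 1
    return counter


print(popularity(io.StringIO(SAMPLE), "the office"))
```

In an interactive version, the second argument would come from a call to `input(...)` and the first from `open("favorites.csv")`.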
17250 16:27:28,811 --> 16:27:32,461 But any questions on how we did\n 17251 16:27:38,641 --> 16:27:40,141 Let's take a five-minute break here. 17252 16:27:40,141 --> 16:27:42,572 When we come back, we'll do it better. 17253 16:27:43,822 --> 16:27:45,951 And the rest of today\nis ultimately about, how 17254 16:27:45,951 --> 16:27:50,182 can we store, and manipulate,\nand change, and retrieve data 17255 16:27:50,182 --> 16:27:53,432 more efficiently than we might\nby just writing raw code? 17256 16:27:53,432 --> 16:27:56,781 This isn't to say that you shouldn't\n 17257 16:27:57,622 --> 16:28:02,362 And in fact, it might be super common\n 17258 16:28:02,362 --> 16:28:04,415 from users that you might\nwant to clean it up. 17259 16:28:04,415 --> 16:28:07,582 And maybe the best way to do that is\n 17260 16:28:07,582 --> 16:28:09,711 you can make all of the\nrequisite changes and fixes 17261 16:28:09,711 --> 16:28:12,864 like we did with The Office,\nfor instance, again and again 17262 16:28:12,864 --> 16:28:15,531 and reuse that code, especially\nif more and more submissions are 17263 16:28:16,491 --> 16:28:18,891 But another theme of\ntoday, ultimately, is 17264 16:28:18,891 --> 16:28:22,980 that sometimes there are different,\n 17265 16:28:22,980 --> 16:28:24,772 And in fact, now at\nthis point in the term 17266 16:28:24,771 --> 16:28:27,651 as we begin to introduce not\njust Python, but in a moment 17267 16:28:27,652 --> 16:28:31,461 a language called SQL, and next\n 17268 16:28:31,461 --> 16:28:34,491 and the week after that, synthesizing\n 17269 16:28:34,491 --> 16:28:37,761 together is to just kind\nof paint a picture of how 17270 16:28:37,762 --> 16:28:41,242 you might decide what the trade-offs are\n 17271 16:28:42,171 --> 16:28:45,112 Because undoubtedly you can\nsolve problems moving forward 17272 16:28:45,112 --> 16:28:48,002 in many different ways\nwith many different tools. 
17273 16:28:48,002 --> 16:28:50,362 So let's give you another\ntool, one with which 17274 16:28:50,362 --> 16:28:53,512 you can implement a proper\nrelational database. 17275 16:28:53,512 --> 16:28:56,391 What we just saw in\nthe form of CSV files 17276 16:28:56,391 --> 16:28:59,152 are what we might call\nflat-file databases. 17277 16:28:59,152 --> 16:29:02,842 Again, just a very simple file, flat\n 17278 16:29:04,612 --> 16:29:09,622 And that is all ultimately\nstoring ASCII or Unicode text. 17279 16:29:09,622 --> 16:29:12,742 A relational database, though,\nis something that's actually 17280 16:29:12,741 --> 16:29:16,191 closer to a proper spreadsheet program. 17281 16:29:16,192 --> 16:29:18,781 A CSV is an individual\nsheet, if you will 17282 16:29:18,781 --> 16:29:20,601 from a spreadsheet when you export it. 17283 16:29:20,601 --> 16:29:22,801 If you had multiple\nsheets in a spreadsheet 17284 16:29:22,802 --> 16:29:24,937 you would have to export multiple CSVs. 17285 16:29:24,936 --> 16:29:26,811 And that gets annoying\nquickly in code if you 17286 16:29:26,811 --> 16:29:29,331 have to open up this CSV,\nthis CSV, all of which 17287 16:29:29,332 --> 16:29:32,421 represent different sheets or\ntabs in a proper spreadsheet. 17288 16:29:32,421 --> 16:29:36,862 A relational database is more\nlike a spreadsheet program 17289 16:29:36,862 --> 16:29:39,982 that you, a programmer,\nnow can interact with. 17290 16:29:41,482 --> 16:29:45,022 You can read data from it, and you\n 17291 16:29:45,021 --> 16:29:47,491 tables storing all of your data. 
17292 16:29:47,491 --> 16:29:49,581 So whereas Excel, and Numbers,\nand Google Spreadsheets 17293 16:29:49,582 --> 16:29:52,432 are meant to be used really by humans\n 17294 16:29:52,432 --> 16:29:55,192 clicking, and pointing, and\nmanipulating things graphically 17295 16:29:55,192 --> 16:29:57,502 a relational database\nusing a language called 17296 16:29:57,502 --> 16:30:02,662 SQL is one in which the programmer\nhas similar capabilities 17297 16:30:04,341 --> 16:30:08,061 Specifically, using a language\ncalled SQL, and at a scale 17298 16:30:08,061 --> 16:30:11,011 that's much grander\nthan spreadsheets alone. 17299 16:30:11,012 --> 16:30:13,762 In fact, if you try on your Mac\n 17300 16:30:13,762 --> 16:30:16,432 got tens of thousands\nof rows, it'll probably 17301 16:30:16,432 --> 16:30:20,122 work fine, hundreds of thousands\n 17302 16:30:20,122 --> 16:30:22,342 At some point your Mac or\nPC is going to struggle 17303 16:30:22,341 --> 16:30:24,471 to open particularly large data sets. 17304 16:30:24,472 --> 16:30:26,961 And that, too, is where\nproper databases come 17305 16:30:26,961 --> 16:30:29,481 into play and proper\nlanguages for databases come 17306 16:30:29,482 --> 16:30:31,462 into play, when it's all about scale. 17307 16:30:31,461 --> 16:30:34,731 And indeed, most any mobile app or\n 17308 16:30:34,732 --> 16:30:38,762 might write should probably plan\n 17309 16:30:38,762 --> 16:30:41,072 So we need the right\ntools for that problem. 17310 16:30:41,072 --> 16:30:44,451 So fortunately, even though we're\n 17311 16:30:44,451 --> 16:30:49,701 it only does four things fundamentally,\n 17312 16:30:49,701 --> 16:30:53,211 SQL, this language for\ndatabases, supports the ability 17313 16:30:53,211 --> 16:30:57,741 to create data, read data,\nupdate data, and delete data.
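Those four operations can be tried from Python with the standard-library `sqlite3` module. This is a sketch, not the lecture's own code: the table layout and rows are hypothetical, and the database lives only in memory.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical two-column table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")

# Create: add two rows.
db.execute("INSERT INTO favorites (title, genres) VALUES (?, ?)", ("The Office", "Comedy"))
db.execute("INSERT INTO favorites (title, genres) VALUES (?, ?)", ("Friends", "Comedy"))

# Read: fetch the titles in alphabetical order.
rows = db.execute("SELECT title FROM favorites ORDER BY title").fetchall()

# Update: change one row's genres.
db.execute("UPDATE favorites SET genres = ? WHERE title = ?", ("Sitcom", "Friends"))

# Delete: remove one row.
db.execute("DELETE FROM favorites WHERE title = ?", ("The Office",))

remaining = db.execute("SELECT title, genres FROM favorites").fetchall()
print(rows)
print(remaining)
db.close()
```

The question marks are placeholders that `sqlite3` fills in from the tuple, which avoids pasting values into the SQL text by hand.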
17314 16:30:58,762 --> 16:31:02,031 There's a few more keywords that\n 17315 16:31:03,091 --> 16:31:04,799 But at the end of the\nday, even if you're 17316 16:31:04,800 --> 16:31:07,522 starting to feel like this\nis a lot very quickly 17317 16:31:07,521 --> 16:31:10,281 it all boils down to these\nfour basic operations. 17318 16:31:10,281 --> 16:31:12,981 And the four commands\nin SQL, if you will 17319 16:31:12,982 --> 16:31:17,122 functions in a sense that implement\n 17320 16:31:17,122 --> 16:31:19,612 They're almost the same but\nwith some slight variance. 17321 16:31:19,612 --> 16:31:24,622 The ability to create or insert data\n 17322 16:31:27,472 --> 16:31:30,389 Delete is the same, but drop\nis also a keyword as well. 17323 16:31:30,389 --> 16:31:32,182 So we'll see these and\na few other keywords 17324 16:31:32,182 --> 16:31:35,752 in SQL that, at the end of the day, just\n 17325 16:31:35,752 --> 16:31:39,652 data using verbs, if\nyou will, like these. 17326 16:31:39,652 --> 16:31:43,372 So to do that, what's\nthe syntax going to be? 17327 16:31:43,372 --> 16:31:45,632 Well, we won't get into the\nweeds too quickly on this. 17328 16:31:45,631 --> 16:31:47,991 But here's a representative\nsyntax of how 17329 16:31:47,991 --> 16:31:51,051 you can create using this\nlanguage called SQL, in your very 17330 16:31:51,052 --> 16:31:53,362 own database, a brand new table. 17331 16:31:53,362 --> 16:31:56,252 This is so easy in Excel, and Google\n 17332 16:31:56,252 --> 16:31:58,252 You want a new sheet, you\nclick the plus button. 17333 16:31:59,031 --> 16:32:00,832 You give it a name,\nand boom, you're done. 
17334 16:32:00,832 --> 16:32:05,391 In the world of programming, though, if\n 17335 16:32:05,391 --> 16:32:08,781 spreadsheet in the computer's memory,\n 17336 16:32:08,781 --> 16:32:13,762 like a sheet, that has a name, and then\n 17337 16:32:13,762 --> 16:32:17,332 But unlike Google Spreadsheets,\nand Apple Numbers, and Excel 17338 16:32:17,332 --> 16:32:20,415 you have to decide as the\nprogrammer what types of data 17339 16:32:20,415 --> 16:32:22,582 you're going to be storing\nin each of these columns. 17340 16:32:22,582 --> 16:32:24,772 Now even though Excel,\nand Google Spreadsheets 17341 16:32:24,771 --> 16:32:28,651 and Numbers does allow you to format\n 17342 16:32:28,652 --> 16:32:33,022 it's not strongly typed data like it\n 17343 16:32:33,021 --> 16:32:35,541 And heck, even in Python\nthere's underlying data types. 17344 16:32:35,542 --> 16:32:37,500 Even if you don't have\nto type them explicitly 17345 16:32:37,500 --> 16:32:40,241 databases are going to want to\nknow, are you storing integers? 17346 16:32:40,241 --> 16:32:41,981 Are you storing real numbers or floats? 17347 16:32:43,482 --> 16:32:46,302 Because especially as your\ndata scales, the more hints 17348 16:32:46,302 --> 16:32:49,752 you give the database about your\n 17349 16:32:49,752 --> 16:32:52,841 the faster it can help you\nget at and store that data. 17350 16:32:52,841 --> 16:32:54,644 So types are about to\nbe important again 17351 16:32:54,644 --> 16:32:57,101 but there's not going to be\nthat many of them, fortunately. 17352 16:32:57,101 --> 16:32:59,981 Now how can I go about converting,\nfor instance, some real data 17353 16:32:59,982 --> 16:33:02,832 like that from you,\nmy favorites.csv file 17354 16:33:02,832 --> 16:33:04,781 into a proper relational database? 
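The point about declaring column types when creating a table can be sketched like this. The table name `shows`, its columns, and the sample row are all hypothetical; the example uses SQLite's built-in `typeof` function to report how each value ended up being stored.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Hypothetical table: the programmer states a type for every column.
db.execute("CREATE TABLE shows (title TEXT, episodes INTEGER, rating REAL)")
db.execute("INSERT INTO shows VALUES (?, ?, ?)", ("Example Show", 201, 8.5))

# Ask SQLite which storage type each value has.
types = db.execute(
    "SELECT typeof(title), typeof(episodes), typeof(rating) FROM shows"
).fetchone()
print(types)
db.close()
```

A spreadsheet would accept any of these values in any cell; here the schema records up front what each column is meant to hold.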
17355 16:33:04,781 --> 16:33:07,991 Well, it turns out that\nusing SQL I can do this 17356 16:33:07,991 --> 16:33:10,601 in VS Code on my own Mac,\nor PC, or in the cloud 17357 16:33:10,601 --> 16:33:13,796 here by just importing\nthe CSV into a database. 17358 16:33:13,796 --> 16:33:15,671 We'll see eventually\nhow to do this manually. 17359 16:33:15,671 --> 16:33:17,963 For now, I'm going to use\nmore of an automated process. 17360 16:33:17,963 --> 16:33:20,021 So let me go over to VS Code here. 17361 16:33:20,021 --> 16:33:22,511 Let me type ls to see\nwhere we left off before. 17362 16:33:22,512 --> 16:33:26,347 I had two files, favorites.csv, which\n 17363 16:33:26,347 --> 16:33:27,972 Recall that I made a couple of changes. 17364 16:33:27,972 --> 16:33:31,391 We deleted a couple of Thes\nfrom the file for The Office. 17365 16:33:31,391 --> 16:33:33,942 But this is the same file\nas before, and then we 17366 16:33:33,942 --> 16:33:36,552 have favorites.py, which\nwe'll set aside for now. 17367 16:33:36,552 --> 16:33:40,212 I'm going to go ahead now\nand run a command SQLite3. 17368 16:33:40,211 --> 16:33:43,362 So in the world of\nrelational databases, there's 17369 16:33:43,362 --> 16:33:48,372 many different products out there,\nmany different software that 17370 16:33:48,372 --> 16:33:50,711 implements the SQL language. 17371 16:33:51,942 --> 16:33:55,422 There's something called MySQL\n 17372 16:33:55,421 --> 16:33:57,461 Facebook, for instance,\nused it early on. 17373 16:33:57,461 --> 16:34:00,372 PostgreSQL, Microsoft\nAccess, SQL Server, Oracle 17374 16:34:00,372 --> 16:34:02,300 and maybe a whole bunch\nof other product names 17375 16:34:02,300 --> 16:34:04,092 you might have encountered\nover time, which 17376 16:34:04,091 --> 16:34:08,322 is to say there's many different\ntypes of tools, and servers 17377 16:34:08,322 --> 16:34:10,332 and software in which you can use SQL.
17378 16:34:10,332 --> 16:34:13,122 We're going to use a very lightweight\n 17379 16:34:14,711 --> 16:34:17,021 This is the version of\nSQL that's generally 17380 16:34:17,021 --> 16:34:19,361 used on iPhones and\nAndroid devices these days. 17381 16:34:19,362 --> 16:34:22,272 If you download an app that stores\ndata like your own contacts 17382 16:34:22,271 --> 16:34:24,341 typically is stored using SQLite. 17383 16:34:24,341 --> 16:34:28,051 Because it's fairly lightweight,\n 17384 16:34:28,052 --> 16:34:31,152 thousands, even tens of\nthousands of pieces of data 17385 16:34:31,152 --> 16:34:33,312 even using this lightweight\nversion thereof. 17386 16:34:33,311 --> 16:34:36,131 SQLite3 is like version 3 of this tool. 17387 16:34:36,131 --> 16:34:41,682 We're going to go ahead and run SQLite3\n 17388 16:34:41,682 --> 16:34:45,461 It's conventional in the world of\n 17389 16:34:45,461 --> 16:34:47,832 I'm going to create a\ndatabase called favorites.db. 17390 16:34:47,832 --> 16:34:52,351 Once I'm inside of the program, now I'm\n 17391 16:34:52,351 --> 16:34:54,101 Again, not something\nyou have to memorize 17392 16:34:54,101 --> 16:34:55,809 just something you\ncan look up as needed. 17393 16:34:55,809 --> 16:34:59,441 And then, I'm going to\nimport favorites.csv 17394 16:34:59,442 --> 16:35:05,602 into a table, that is, a sheet, if\n 17395 16:35:05,601 --> 16:35:09,371 Now I'm going to hit Enter and I'm\n 17396 16:35:10,991 --> 16:35:13,511 Now I have three files\nin my current directory-- 17397 16:35:13,512 --> 16:35:17,472 the CSV file, the Python file\nfrom before, and now favorites.db. 17398 16:35:17,472 --> 16:35:21,522 But if I did this right, all of the\n 17399 16:35:21,521 --> 16:35:25,182 has now been loaded into a proper\ndatabase where I can now use 17400 16:35:25,182 --> 16:35:28,521 this SQL language to access it instead. 
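For readers who want to see what such an import amounts to, here is a rough Python approximation. It is not the sqlite3 shell's own import command, just a sketch of the same idea under stated assumptions: the helper name `import_csv` is hypothetical, the CSV text is a made-up stand-in for favorites.csv, every column is declared as TEXT, and the database is kept in memory rather than written to favorites.db.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for favorites.csv.
SAMPLE = """Timestamp,title,genres
t1,The Office,Comedy
t2,Friends,Comedy
t3,Game of Thrones,"Drama, Fantasy"
"""


def import_csv(csv_file, db, table):
    reader = csv.reader(csv_file)
    header = next(reader)  # the first row supplies the column names
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    db.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    # Every remaining CSV row becomes one row in the table.
    db.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)


db = sqlite3.connect(":memory:")
import_csv(io.StringIO(SAMPLE), db, "favorites")

count = db.execute("SELECT COUNT(*) FROM favorites").fetchone()[0]
column_names = [row[1] for row in db.execute("PRAGMA table_info(favorites)")]
print(count, column_names)
```

Passing a file name such as "favorites.db" to `sqlite3.connect` instead of ":memory:" would keep the table on disk.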
17401 16:35:28,521 --> 16:35:33,072 So let's go ahead again and run SQLite3\n 17402 16:35:33,072 --> 16:35:35,982 And now, at the SQLite\nprompt I can start 17403 16:35:35,982 --> 16:35:38,502 to play around and\nsee what this data is. 17404 16:35:38,502 --> 16:35:41,951 For instance, I can\nlook, by typing .schema 17405 16:35:41,951 --> 16:35:44,703 at what the schema is of\nmy data, what's the design. 17406 16:35:44,703 --> 16:35:47,411 Now no thought was put into the\n 17407 16:35:47,411 --> 16:35:49,241 because I automated the whole process. 17408 16:35:49,241 --> 16:35:52,091 Once we start creating\nour own databases we'll 17409 16:35:52,091 --> 16:35:55,091 give more thought to the data\n 17410 16:35:55,091 --> 16:35:59,561 But we can see what SQLite\npresumed I wanted just 17411 16:35:59,561 --> 16:36:01,871 by importing the data by default. 17412 16:36:01,872 --> 16:36:06,461 What the import command did for me a\n 17413 16:36:06,461 --> 16:36:09,851 It automated the process of creating\n 17414 16:36:11,112 --> 16:36:14,322 And then notice, in parentheses\nit gave me three columns-- 17415 16:36:14,322 --> 16:36:18,701 timestamp, title, and genres, which\n 17416 16:36:18,701 --> 16:36:21,341 All three of which have\nbeen decreed to be text. 17417 16:36:21,341 --> 16:36:24,521 Again, once we're more comfortable\nwe'll create our own tables 17418 16:36:24,521 --> 16:36:26,351 choose our own types and column names. 17419 16:36:26,351 --> 16:36:28,691 But for now, I just automated\nthe whole process just 17420 16:36:28,692 --> 16:36:33,461 to get us started by using this\nbuilt-in import command as well. 17421 16:36:34,152 --> 16:36:36,972 So what now can I begin to do? 17422 16:36:36,972 --> 16:36:42,252 Well, if I wanted to, for instance,\n 17423 16:36:42,252 --> 16:36:44,936 I might execute a couple\nof different commands. 17424 16:36:48,341 --> 16:36:53,762 Let me find the right one here--\none of which would be select. 
17425 16:36:53,762 --> 16:36:56,951 Select being one of our\nmost versatile tools 17426 16:36:56,951 --> 16:36:58,521 to select data from this database. 17427 16:36:58,521 --> 16:37:01,061 So if I have these three\ncolumns here-- timestamp 17428 16:37:01,061 --> 16:37:04,362 title, and genres, suppose I\nwant to select all of the titles. 17429 16:37:04,362 --> 16:37:09,131 Doing that earlier in Python\nrequired importing the CSV library 17430 16:37:09,131 --> 16:37:14,081 opening the file, creating a reader or\n 17431 16:37:14,082 --> 16:37:16,842 adding every title to a dictionary\nor just printing it out 17432 16:37:17,652 --> 16:37:20,512 There was a dozen or so lines\nof code when we first began. 17433 16:37:22,182 --> 16:37:26,561 Select title from\nfavorites, semicolon, done. 17434 16:37:26,561 --> 16:37:30,911 So now, with this particular\n 17435 16:37:30,911 --> 16:37:34,271 and it's simulating what it looks like\n 17436 16:37:36,211 --> 16:37:39,421 Select title from\nfavorites is a distillation 17437 16:37:39,421 --> 16:37:42,871 in a different language called\nSQL of all the lines of code 17438 16:37:42,872 --> 16:37:46,082 I wrote early on when we first\n 17439 16:37:46,082 --> 16:37:50,882 SQL is therefore optimized for\n 17440 16:37:50,881 --> 16:37:52,841 and ultimately, deleting data. 17441 16:37:52,841 --> 16:37:56,041 So here's perhaps a better tool\n 17442 16:37:56,042 --> 16:37:59,372 Tossing it into a more\npowerful, versatile format 17443 16:37:59,372 --> 16:38:02,569 might allow you now to get\nmore work done more quickly 17444 16:38:02,569 --> 16:38:04,112 without having to reinvent the wheel. 17445 16:38:04,112 --> 16:38:06,851 Someone else has figured out\nhow to select data like this. 17446 16:38:09,101 --> 16:38:12,391 Well, let me go ahead and pull\n 17447 16:38:14,732 --> 16:38:19,182 Give me one second to find this. 17448 16:38:19,182 --> 16:38:23,432 So suppose I want to now select\ndata a little more powerfully. 
17449 16:38:23,432 --> 16:38:25,561 So here's what I just\ndid in a canonical way. 17450 16:38:25,561 --> 16:38:27,061 So select typically works like this. 17451 16:38:27,061 --> 16:38:31,201 You select columns from a\nspecific table, semicolon. 17452 16:38:31,201 --> 16:38:33,601 Unfortunately, stupid\nsemicolons are back. 17453 16:38:33,601 --> 16:38:38,051 Select columns from table then, is\n 17454 16:38:38,052 --> 16:38:42,463 More specifically, I selected one\n 17455 16:38:42,463 --> 16:38:43,921 Favorites is the name of the table. 17456 16:38:45,031 --> 16:38:48,781 Suppose I wanted to get two things, like\n 17457 16:38:48,781 --> 16:38:53,762 I could instead do select title,\ncomma, genres from favorites 17458 16:38:53,762 --> 16:38:55,562 and then, a semicolon, and Enter. 17459 16:38:55,561 --> 16:38:57,451 It's going to look a\nlittle ugly on my screen 17460 16:38:57,451 --> 16:38:59,011 because some of these titles and-- 17461 16:38:59,012 --> 16:39:02,641 OK, one of you really went\nall out with Community. 17462 16:39:02,641 --> 16:39:06,002 You can see that it's just\nwrapping in an ugly way 17463 16:39:06,002 --> 16:39:08,641 but it's just now\nshowing me two columns. 17464 16:39:08,641 --> 16:39:12,182 If we scroll up to the very top\nagain, the left most of one 17465 16:39:12,182 --> 16:39:13,622 Black Mirror went all out, too. 17466 16:39:14,491 --> 16:39:17,341 And now, OK, we're going to\nhave to clean some of these up. 17467 16:39:17,341 --> 16:39:19,456 Game of Thrones, good comedy, yes. 17468 16:39:22,891 --> 16:39:24,822 Keep going, keep going, keep going. 17469 16:39:24,822 --> 16:39:28,211 So now we've selected two of\nthe columns that we care about. 17470 16:39:28,711 --> 16:39:31,722 OK, so it's crazy wide because\nof all of those genres. 17471 16:39:31,722 --> 16:39:34,476 But it allows me to select\nexactly the data I want. 
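The SELECT columns FROM table pattern above can be tried from Python as well. A minimal sketch, assuming the standard sqlite3 module and two made-up rows in place of the class's data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favorites (timestamp TEXT, title TEXT, genres TEXT)")
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?, ?)",
    [
        ("t1", "Community", "Comedy"),  # hypothetical sample rows
        ("t2", "Black Mirror", "Drama, Sci-Fi, Thriller"),
    ],
)

# One column: SELECT title FROM favorites;
titles = conn.execute("SELECT title FROM favorites").fetchall()

# Two columns, separated by a comma: SELECT title, genres FROM favorites;
pairs = conn.execute("SELECT title, genres FROM favorites").fetchall()

for title, genres in pairs:
    print(title, "|", genres)
```

Each result row comes back as a tuple with one entry per selected column.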
17472 16:39:34,476 --> 16:39:37,351 Let's go back to the titles, though,\n 17473 16:39:39,091 --> 16:39:43,651 For instance, it turns out, using\n 17474 16:39:45,241 --> 16:39:48,184 You've got a lot of functions, similar\n 17475 16:39:48,184 --> 16:39:49,351 where you can have formulas. 17476 16:39:49,351 --> 16:39:51,661 SQL provides you with some\nof the same heuristics that 17477 16:39:51,661 --> 16:39:55,691 allow you to apply operations\nlike these on entire columns. 17478 16:39:55,692 --> 16:39:58,262 For instance, you can take\naverages, count the total 17479 16:39:58,262 --> 16:40:01,351 get the distinct values, force\n 17480 16:40:02,561 --> 16:40:04,951 So let's try distinct, for instance. 17481 16:40:04,951 --> 16:40:08,791 Let me go back to my Terminal,\nand let's say, select 17482 16:40:08,792 --> 16:40:14,101 how about the distinct titles\nfrom the favorites table? 17483 16:40:14,855 --> 16:40:16,772 I didn't bother selecting\nthe genres because I 17484 16:40:16,771 --> 16:40:18,104 want it to be a little prettier. 17485 16:40:18,105 --> 16:40:23,432 And you can see here that we\nhave just the distinct titles 17486 16:40:23,432 --> 16:40:25,889 except for issues of formatting. 17487 16:40:25,889 --> 16:40:27,722 So whitespace is going\nto be an issue again. 17488 16:40:27,722 --> 16:40:29,555 Capitalization is going\nto be a thing again. 17489 16:40:30,572 --> 16:40:34,622 One of the things I was doing in Python\n 17490 16:40:34,622 --> 16:40:36,272 and then getting rid of whitespace. 17491 16:40:36,271 --> 16:40:37,741 But we could combine some of these. 17492 16:40:37,741 --> 16:40:40,682 I could do something like\nforce every title to uppercase 17493 16:40:40,682 --> 16:40:41,849 then get the distinct value. 17494 16:40:41,849 --> 16:40:44,724 And that's actually going to get\n 17495 16:40:44,724 --> 16:40:47,202 And again, I did it all in\none simple line that was fast. 
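To see why combining the functions helps, here is a small sketch, assuming Python's sqlite3 module and four invented title spellings. TRIM is an addition of this sketch to handle the whitespace issue mentioned above; the lecture's query uses only DISTINCT and UPPER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favorites (title TEXT)")
conn.executemany(
    "INSERT INTO favorites VALUES (?)",
    [("The Office",), ("the office",), ("The Office ",), ("Friends",)],
)

# DISTINCT alone treats different capitalization and spacing as different titles.
distinct_raw = conn.execute("SELECT DISTINCT title FROM favorites").fetchall()

# Canonicalize first, then deduplicate.
distinct_clean = conn.execute(
    "SELECT DISTINCT UPPER(TRIM(title)) FROM favorites"
).fetchall()

print(len(distinct_raw), len(distinct_clean))
```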
17496 16:40:47,201 --> 16:40:49,368 So let me pull up at the\nbottom of the screen again. 17497 16:40:49,368 --> 16:40:53,432 I selected distinct upper\ntitles from favorites 17498 16:40:53,432 --> 16:40:56,412 and that did everything for\nme at once in just one breath. 17499 16:40:56,411 --> 16:40:58,981 Suppose I want to get the total\nnumber of counts of titles. 17500 16:40:58,982 --> 16:41:05,492 How about select count of all\nof those titles from favorites? 17501 16:41:05,491 --> 16:41:09,331 Semicolon, Enter, and now\nyou get back a mini table 17502 16:41:09,332 --> 16:41:13,302 that contains just your\nanswer, 158 in this case. 17503 16:41:13,302 --> 16:41:15,902 So that's the total\nnumber of, not distinct 17504 16:41:15,902 --> 16:41:18,031 but total titles that\nwe had in the file. 17505 16:41:18,031 --> 16:41:21,902 And we could continue to manipulate\n 17506 16:41:23,792 --> 16:41:26,891 But there's also additional\nfiltration we can do. 17507 16:41:26,891 --> 16:41:32,351 We can also qualify our selections by\n 17508 16:41:32,351 --> 16:41:35,972 So just as in Scratch, and C, and\n 17509 16:41:35,972 --> 16:41:41,612 you can have the same in SQL as well,\n 17510 16:41:44,732 --> 16:41:46,682 Like allows me to do approximations. 17511 16:41:46,682 --> 16:41:48,842 If I want to get something\nthat's like The Office 17512 16:41:48,841 --> 16:41:51,661 but not necessarily\nT-H-E, space, Office 17513 16:41:51,661 --> 16:41:54,781 I could do pattern\nmatching using like here. 17514 16:41:54,781 --> 16:41:58,512 Order by, limit, and grouped by are\n 17515 16:41:58,512 --> 16:42:01,322 So let me go back and do\na couple of these here. 17516 16:42:01,322 --> 16:42:07,951 How about, let me just get, oh, I don't\n 17517 16:42:07,951 --> 16:42:10,189 but limit it to 10 results. 17518 16:42:10,190 --> 16:42:13,232 That might be one thing that's helpful\n 17519 16:42:13,232 --> 16:42:15,452 of the data at the top there instead. 
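COUNT and LIMIT can be sketched the same way. This assumes Python's sqlite3 module and 25 placeholder titles, so the count here is not the 158 seen in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favorites (title TEXT)")
# Placeholder data: 25 generated titles instead of the real submissions.
conn.executemany(
    "INSERT INTO favorites VALUES (?)",
    [(f"Show {i}",) for i in range(25)],
)

# COUNT returns a one-row, one-column "mini table" holding the answer.
total = conn.execute("SELECT COUNT(title) FROM favorites").fetchone()[0]

# LIMIT caps how many rows come back.
first_ten = conn.execute("SELECT title FROM favorites LIMIT 10").fetchall()

print(total, len(first_ten))
```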
17520 16:42:15,451 --> 16:42:21,871 How about, select all of the titles\n 17521 16:42:21,872 --> 16:42:25,052 is like, quote, unquote, "Office"? 17522 16:42:25,052 --> 16:42:28,082 And this will give me only two answers. 17523 16:42:28,082 --> 16:42:32,522 Those are the two rows, recall, that I\n 17524 16:42:32,521 --> 16:42:37,221 Notice that like allows me to\ntolerate uppercase and lowercase. 17525 16:42:37,222 --> 16:42:40,222 Because if I instead\njust use the equal sign 17526 16:42:40,222 --> 16:42:46,402 and in SQL a single equal sign\ndoes, in fact, mean equality 17527 16:42:46,402 --> 16:42:48,711 for comparison's sake.\nIt's not doing assignment. 17528 16:42:48,711 --> 16:42:51,411 This is not how you assign data in SQL. 17529 16:42:51,411 --> 16:42:53,281 I got back no answers there. 17530 16:42:53,281 --> 16:42:56,961 So indeed, the equal sign\nis giving me literal answers 17531 16:42:56,961 --> 16:42:59,332 that search just for what I typed in. 17532 16:42:59,332 --> 16:43:00,711 How could I get all of these? 17533 16:43:00,711 --> 16:43:04,432 Well, similar in spirit to regular\n 17534 16:43:04,432 --> 16:43:06,961 in SQL, I could do something like this. 17535 16:43:06,961 --> 16:43:10,671 I can select the title from favorites\n 17536 16:43:12,052 --> 16:43:17,792 But I can add, a bit weirdly, percent\n 17537 16:43:17,792 --> 16:43:23,272 So the language SQL supports the\nsame notion of pattern matching 17538 16:43:23,271 --> 16:43:24,888 but much more limited out of the box. 17539 16:43:24,889 --> 16:43:26,722 If we want more powerful\nregular expressions 17540 16:43:26,722 --> 16:43:28,762 we probably do want\nto use Python instead. 17541 16:43:28,762 --> 16:43:32,062 But the percent sign here\nmeans 0 or more characters 17542 16:43:32,061 --> 16:43:34,682 on the left, 0 or more\ncharacters on the right.
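The three comparisons just described (equals, LIKE without wildcards, LIKE with percent signs) can be lined up side by side. A sketch assuming Python's sqlite3 module and five invented titles; it relies on SQLite's default behavior that LIKE ignores case for ASCII letters while = does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favorites (title TEXT)")
conn.executemany(
    "INSERT INTO favorites VALUES (?)",
    [("The Office",), ("the office",), ("Office",), ("OFFICE",), ("Friends",)],
)


def titles_where(condition, value):
    """Return titles matching `title <condition> ?`, sorted for stable output."""
    query = f"SELECT title FROM favorites WHERE title {condition} ? ORDER BY title"
    return [row[0] for row in conn.execute(query, (value,))]


exact = titles_where("=", "office")         # literal, case-sensitive match
loose = titles_where("LIKE", "office")      # same text, any capitalization
anywhere = titles_where("LIKE", "%office%") # 0 or more characters on each side

print(exact)
print(loose)
print(anywhere)
```

With this sample data the equals comparison finds nothing, plain LIKE finds the two one-word spellings, and the wildcard version finds all four Office variants.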
17543 16:43:34,682 --> 16:43:39,802 So this will just grab any title that\n 17544 16:43:40,442 --> 16:43:44,632 And now I get all 16, it would\nseem, of those results, again. 17545 16:43:45,832 --> 16:43:48,502 Well, I can just get the\ncount of those titles 17546 16:43:48,502 --> 16:43:51,482 and get back that\nanswer instead as well. 17547 16:43:51,482 --> 16:43:54,982 So again, it takes some\ngetting used to, the vocabulary 17548 16:43:54,982 --> 16:43:56,315 and the syntax that you can use. 17549 16:43:56,315 --> 16:43:58,024 There's these building\nblocks and others. 17550 16:43:58,023 --> 16:44:00,841 But SQL is really designed, again,\n 17551 16:44:01,822 --> 16:44:06,421 For instance, I've never really\n 17552 16:44:06,421 --> 16:44:12,201 So right now if I do select,\nhow about title from favorites 17553 16:44:12,201 --> 16:44:18,682 where title like, quote, unquote,\n 17554 16:44:18,682 --> 16:44:20,671 We can see that there's\na whole bunch of them. 17555 16:44:21,862 --> 16:44:23,302 Let's just do a quick count. 17556 16:44:25,232 --> 16:44:28,642 Well, delete from favorites. 17557 16:44:28,641 --> 16:44:35,841 OK, you and me, delete from favorites,\n 17558 16:44:35,841 --> 16:44:39,621 Nothing seems to happen,\nbut bye-bye Friends. 17559 16:44:44,451 --> 16:44:46,731 So now we've actually changed the data. 17560 16:44:46,732 --> 16:44:50,452 And this is what's compelling\nabout a proper database. 17561 16:44:50,451 --> 16:44:54,661 Yes, you could technically write Python\n 17562 16:44:55,881 --> 16:44:58,161 You can change using quote,\nunquote, "A" for append 17563 16:44:58,161 --> 16:45:01,251 or quote, unquote, "W" for\nwrite, instead of quote, unquote 17564 16:45:02,605 --> 16:45:05,272 But it's definitely a little more\ninvolved to do that in Python. 17565 16:45:05,271 --> 16:45:07,591 But with SQL, you can update\nthe data in real time. 
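A deletion like the one above can be sketched as follows. The exact pattern typed in the lecture is cut off in this transcript, so the '%friends%' pattern here is an assumption, as are the four sample titles and the use of Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favorites (title TEXT)")
conn.executemany(
    "INSERT INTO favorites VALUES (?)",
    [("Friends",), ("friends",), ("Friends ",), ("The Office",)],
)

# DELETE prints nothing at the SQLite prompt; rowcount shows what actually happened.
cursor = conn.execute("DELETE FROM favorites WHERE title LIKE ?", ("%friends%",))
deleted = cursor.rowcount
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT title FROM favorites")]
print(deleted, remaining)
```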
17566 16:45:07,591 --> 16:45:11,091 And if I were actually running a\n 17567 16:45:11,091 --> 16:45:13,221 for a mobile app, that\nchange, theoretically 17568 16:45:13,222 --> 16:45:15,502 would be reflected everywhere\non your own devices 17569 16:45:15,502 --> 16:45:17,552 if you're somehow talking\nto this application. 17570 16:45:17,552 --> 16:45:19,337 So that's the direction we're headed. 17571 16:45:19,336 --> 16:45:20,961 This other thing has been bothering me. 17572 16:45:20,961 --> 16:45:27,981 So select, how about title from\nfavorites, where title equals 17573 16:45:33,021 --> 16:45:37,491 How about we update\nfavorites by setting title 17574 16:45:37,491 --> 16:45:44,421 equal to The Office, where title\n 17575 16:45:45,811 --> 16:45:47,991 And now, if I select\nthe same thing again 17576 16:45:47,991 --> 16:45:50,152 I can go up and down with\nmy arrow keys quickly. 17577 16:45:50,152 --> 16:45:52,432 Now there is no The V Office. 17578 16:45:52,432 --> 16:45:54,542 We've actually changed that value. 17579 16:45:55,641 --> 16:46:01,222 Select genres from favorites,\nwhere the title is title 17580 16:46:01,222 --> 16:46:04,792 equals Game of Thrones, semicolon. 17581 16:46:04,792 --> 16:46:08,342 These were kind of long, and I\n 17582 16:46:08,341 --> 16:46:14,901 So how about we update favorites,\nset genres equal to, sure 17583 16:46:14,902 --> 16:46:17,991 action, adventure, sure, drama? 17584 16:46:19,732 --> 16:46:22,042 Fantasy, sure, thriller, war. 17585 16:46:22,042 --> 16:46:26,391 OK, anything really but\ncomedy, I would say. 17586 16:46:26,391 --> 16:46:28,502 Let's go ahead and hit Enter now. 17587 16:46:28,502 --> 16:46:33,141 And now, if I select genres again, same\n 17588 16:46:34,502 --> 16:46:36,591 So whether or not that\nis right is probably 17589 16:46:36,591 --> 16:46:38,361 a bit subjective and argumentative. 
17590 16:46:38,362 --> 16:46:42,262 But I have at least cleaned up my\n 17591 16:46:42,262 --> 16:46:46,012 Create, read, update, delete,\nyou can do it that easily. 17592 16:46:47,631 --> 16:46:51,771 Beware worse using drop, whereby\nyou can drop an entire table. 17593 16:46:51,771 --> 16:46:54,651 But via these kinds of\ncommands, can we actually now 17594 16:46:54,652 --> 16:46:58,732 manipulate our data much more\nrapidly and with single thoughts. 17595 16:46:58,732 --> 16:47:01,732 And in fact, if you're an aspiring\n 17596 16:47:01,732 --> 16:47:05,662 or analyst in the real world, SQL\n 17597 16:47:05,661 --> 16:47:08,914 because it allows you to really\ndive into data quickly, and ask 17598 16:47:08,915 --> 16:47:11,332 questions of the data, and get\nback answers quite quickly. 17599 16:47:11,332 --> 16:47:12,872 And this is a simple data set. 17600 16:47:12,872 --> 16:47:17,182 You can do this with much larger\ndata sets as we soon will, too. 17601 16:47:17,182 --> 16:47:20,391 Or any questions on what\nwe've seen of SQL thus far? 17602 16:47:20,391 --> 16:47:22,512 Only scratched the\nsurface, but again, it 17603 16:47:22,512 --> 16:47:28,442 boils down to creating, reading,\nupdating, and deleting data. 17604 16:47:30,752 --> 16:47:33,162 Well, let's consider\nthe design of this data. 17605 16:47:33,161 --> 16:47:37,241 Recall that if I do .schema, that\n 17606 16:47:37,241 --> 16:47:39,331 the so-called schema of my data. 17607 16:47:40,351 --> 16:47:42,991 It gets the job done, and frankly,\neverything the user typed in 17608 16:47:42,991 --> 16:47:46,951 was arguably text, including the\n 17609 16:47:46,951 --> 16:47:49,381 But so the data set\nitself is somewhat simple. 17610 16:47:49,381 --> 16:47:54,911 But if we look at the data set itself,\n 17611 16:47:54,911 --> 16:47:57,302 Select genres from favorites. 17612 16:47:57,302 --> 16:47:59,882 And let me point out one other\nthing stylistically, too. 
17613 16:47:59,881 --> 16:48:04,391 I am very deliberately capitalizing\n 17614 16:48:04,391 --> 16:48:08,042 and I'm lowercasing all of the\ncolumn names and the table names. 17615 16:48:08,042 --> 16:48:11,101 This is a convention, and\nhonestly, it just helps you read 17616 16:48:11,101 --> 16:48:14,551 I think, the code when you're\nco-mingling your names for columns 17617 16:48:14,552 --> 16:48:17,942 and tables with proper SQL keywords. 17618 16:48:17,942 --> 16:48:23,370 But I could just as easily do\nselect genres from favorites 17619 16:48:23,370 --> 16:48:26,491 but again, the SQL-specific keywords\n 17620 16:48:26,491 --> 16:48:29,582 So stylistically, we would\nrecommend this, selecting genres 17621 16:48:38,192 --> 16:48:40,562 I accidentally made\nevery show, including 17622 16:48:40,561 --> 16:48:45,752 The Office, about action, adventure,\n 17623 16:48:45,752 --> 16:48:49,802 How did I do that accidentally? 17624 16:48:57,802 --> 16:48:59,701 I think I did say\nbeware around this time. 17625 16:48:59,701 --> 16:49:03,152 So the SQL database took me literally--\nI updated favorites 17626 16:49:03,152 --> 16:49:06,351 setting genres equal to that,\nsemicolon, end of thought. 17627 16:49:06,351 --> 16:49:08,841 I really wanted to say\nwhere title equals 17628 16:49:08,841 --> 16:49:11,361 quote, unquote, "Game of Thrones." 17629 16:49:11,362 --> 16:49:14,421 Unfortunately, there isn't an\nundo command or time machine 17630 16:49:14,421 --> 16:49:17,271 with a SQL database, so\nthe best we can do here 17631 16:49:17,271 --> 16:49:21,591 is, let's actually get\nrid of favorites.db. 17632 16:49:21,591 --> 16:49:27,681 Let's run SQLite of favorites.db\n 17633 16:49:27,682 --> 16:49:29,781 Let me change myself into CSV mode. 17634 16:49:29,781 --> 16:49:35,271 Let me import, into my\nfavorites table, the CSV file.
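The difference between the accidental update and the intended one comes down to the WHERE clause. A sketch, assuming Python's sqlite3 module and three invented rows, that runs both versions on separate copies of the data and counts the rows each one changes:

```python
import sqlite3

GENRES = "Action, Adventure, Drama, Fantasy, Thriller, War"


def fresh_db():
    """Build a tiny, hypothetical favorites table in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")
    conn.executemany(
        "INSERT INTO favorites VALUES (?, ?)",
        [("Game of Thrones", "Comedy"), ("The Office", "Comedy"), ("Friends", "Comedy")],
    )
    return conn


# No WHERE clause: every row is rewritten.
careless = fresh_db()
changed_all = careless.execute("UPDATE favorites SET genres = ?", (GENRES,)).rowcount

# With WHERE: only the matching row is rewritten.
careful = fresh_db()
changed_one = careful.execute(
    "UPDATE favorites SET genres = ? WHERE title = ?", (GENRES, "Game of Thrones")
).rowcount
office_genres = careful.execute(
    "SELECT genres FROM favorites WHERE title = 'The Office'"
).fetchone()[0]

print(changed_all, changed_one, office_genres)
```

Checking the affected-row count (or running the same WHERE clause in a SELECT first) is one inexpensive way to catch this mistake before it spreads.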
17635 16:49:35,271 --> 16:49:39,591 And now, Friends is back,\nfor better or for worse 17636 16:49:39,591 --> 16:49:40,851 but so are all of our genres. 17637 16:49:43,432 --> 16:49:46,671 If I now reload the file\nand do select, star, from-- 17638 16:49:47,182 --> 16:49:51,180 Select genres from favorites,\nthat was the result I was getting. 17639 16:49:51,180 --> 16:49:53,972 It's much messier, but that's\n 17640 16:49:53,972 --> 16:49:55,639 But now we're back to the original data. 17641 16:49:55,639 --> 16:49:58,042 Lesson here, be sure\nto back up your work. 17642 16:49:58,582 --> 16:50:02,192 So what more can we\nnow do with this data? 17643 16:50:02,192 --> 16:50:05,822 Well, I don't love the design of the\n 17644 16:50:05,822 --> 16:50:08,542 One, we didn't have\nany sort of validation 17645 16:50:08,542 --> 16:50:10,552 but user input is going to be messy. 17646 16:50:10,552 --> 16:50:13,132 There's just a lot of\nredundancy in here. 17647 16:50:15,061 --> 16:50:17,101 Let me select all the\ncomedies you all typed in. 17648 16:50:17,101 --> 16:50:23,301 So select title from\nfavorites, where genres equals 17649 16:50:25,461 --> 16:50:31,072 OK, so there's all of the shows\nthat are explicitly comedies. 17650 16:50:31,072 --> 16:50:34,281 But I think there might\nactually be others. 17651 16:50:37,521 --> 16:50:39,381 What was a comedy and a drama? 17652 16:50:39,381 --> 16:50:44,481 How about let's search for the-- oops,\n 17653 16:50:44,482 --> 16:50:49,042 OK, so The Office, in this case, was\n 17654 16:50:49,042 --> 16:50:52,351 It's Always Sunny in Philadelphia,\nand Gilmore Girls as well. 17655 16:50:52,351 --> 16:50:56,792 But notice that I get many more\nwhen I just search for comedy. 
17656 16:50:56,792 --> 16:51:01,792 So the catch here is that, because I\n 17657 16:51:01,792 --> 16:51:04,372 the way Google did, as\na comma-separated list 17658 16:51:04,372 --> 16:51:08,932 it's actually really hard and messy\n 17659 16:51:08,932 --> 16:51:12,171 that are somewhere described as comedy. 17660 16:51:12,171 --> 16:51:15,021 Because if I search for quote,\n 17661 16:51:15,021 --> 16:51:18,951 I'm going to get are this one, whatever\n 17662 16:51:20,582 --> 16:51:22,012 But I'm not going to get this one. 17663 16:51:22,012 --> 16:51:23,512 I'm not going to get this one. 17664 16:51:24,411 --> 16:51:28,131 If I'm searching for, where genres\n 17665 16:51:28,131 --> 16:51:29,691 why am I missing those other shows? 17666 16:51:37,362 --> 16:51:39,732 It's not just a comedy,\nit's a comedy and a drama 17667 16:51:39,732 --> 16:51:42,022 and a comedy or a news\nshow, and so forth. 17668 16:51:42,021 --> 16:51:45,851 So I have to search for these commas,\n 17669 16:51:45,851 --> 16:51:47,902 Let me copy this so I can do this. 17670 16:51:47,902 --> 16:51:51,491 Let me search for where\ngenres equals comedy. 17671 16:51:51,491 --> 16:51:58,572 How about, or genres equals\ncomedy, drama, or genres 17672 16:51:58,572 --> 16:52:01,991 equals this whole thing,\ncomedy, news, talk show? 17673 16:52:01,991 --> 16:52:03,711 I'm going to get more and more results. 17674 16:52:03,711 --> 16:52:05,241 But that's not going to scale well. 17675 16:52:05,241 --> 16:52:08,152 What could I do instead\nof enumerating with ors 17676 16:52:08,152 --> 16:52:11,141 all of the different permutations\nof genres, do you think? 17677 16:52:15,872 --> 16:52:19,772 So I could use the keyword is,\nsimilar in Python to the word in. 
17678 16:52:19,771 --> 16:52:22,322 I could use the like\nkeyword so that so long 17679 16:52:22,322 --> 16:52:27,421 as the genres is like\ncomedy somewhere in there 17680 16:52:27,421 --> 16:52:31,241 that's going to give me all of them,\n 17681 16:52:31,241 --> 16:52:34,606 But let me go ahead and just\nopen the form from earlier. 17682 16:52:37,591 --> 16:52:40,414 Let me see if I can open this\nreal quick before I toggle over. 17683 16:52:40,415 --> 16:52:42,332 If we look back at the\nform, recall that there 17684 16:52:42,332 --> 16:52:47,972 were all of those radio buttons\nasking for the specific genres 17685 16:52:49,872 --> 16:52:55,052 And if I open this, let me full screen\n 17686 16:52:55,052 --> 16:52:58,262 You'll see all of the\ngenres here, none of which 17687 16:52:58,262 --> 16:53:04,022 are that worrisome except for a\n 17688 16:53:04,021 --> 16:53:08,671 Where might the like keyword\nalone get me into trouble? 17689 16:53:13,021 --> 16:53:16,531 DAVID J. MALAN: Yeah, music and musical\n 17690 16:53:16,531 --> 16:53:19,002 Because, one, they're separate genres. 17691 16:53:19,002 --> 16:53:21,652 But if I just search for\nsomething that's like music 17692 16:53:21,652 --> 16:53:24,152 I'm going to accidentally suck\nin all of the musicals, which 17693 16:53:25,292 --> 16:53:28,652 If music is a music video or\nwhatever, and musical is actually 17694 16:53:28,652 --> 16:53:31,961 a different type of show, I\ndon't want to just do that. 17695 16:53:31,961 --> 16:53:33,451 So it seems just very messy. 17696 16:53:33,451 --> 16:53:37,023 I could probably hack something together\n 17697 16:53:37,982 --> 16:53:40,862 But this is just not a\ngood design for the data. 17698 16:53:40,862 --> 16:53:43,232 Google has done it this\nway because it's just 17699 16:53:43,232 --> 16:53:47,342 simple to actually keep the user's\ndata all in a single column 17700 16:53:47,341 --> 16:53:49,861 and just as they did,\nseparate it by commas. 
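The music versus musical problem is easy to reproduce. A sketch assuming Python's sqlite3 module and four hypothetical shows; the exact-match half is done in Python only to show what the intended answer would be.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE favorites (title TEXT, genres TEXT)")
conn.executemany(
    "INSERT INTO favorites VALUES (?, ?)",
    [
        ("Show A", "Music"),
        ("Show B", "Musical"),
        ("Show C", "Comedy, Music"),
        ("Show D", "Comedy, Musical"),
    ],
)

# Substring match: "Musical" contains "Music", so the musicals come along too.
like_hits = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM favorites WHERE genres LIKE '%Music%' ORDER BY title"
    )
]

# What was actually wanted: shows whose comma-separated list contains exactly "Music".
exact_hits = sorted(
    title
    for title, genres in conn.execute("SELECT title, genres FROM favorites")
    if "Music" in genres.split(", ")
)

print(like_hits)
print(exact_hits)
```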
17701 16:53:49,862 --> 16:53:54,061 But this is a real\nmessy way to use CSV is 17702 16:53:54,061 --> 16:53:58,171 by putting comma-separated values\n 17703 16:53:58,171 --> 16:54:00,691 Arguably, the folks at\nGoogle probably just did this 17704 16:54:01,862 --> 16:54:03,987 And they didn't want to\ngive people multiple sheets 17705 16:54:03,987 --> 16:54:07,561 or complicate things using some other\n 17706 16:54:07,561 --> 16:54:09,792 But I bet there's a better\nway for us to do this. 17707 16:54:09,792 --> 16:54:11,202 And let me go ahead and do this. 17708 16:54:11,201 --> 16:54:13,319 Let me go back into my code here. 17709 16:54:13,319 --> 16:54:15,362 And in just a moment, I'm\ngoing to grab a program 17710 16:54:15,362 --> 16:54:19,592 that I wrote in advance that's going\n 17711 16:54:19,591 --> 16:54:24,451 iterate over all of the rows, and load\n 17712 16:54:24,451 --> 16:54:27,971 two tables, one called\nshows, and one called genres 17713 16:54:27,972 --> 16:54:30,482 so as to actually separate\nthese two things out. 17714 16:54:30,482 --> 16:54:33,072 Give me just a moment to grab the code. 17715 16:54:33,072 --> 16:54:36,061 And when I run this, I'll\nonly have to run it once. 17716 16:54:36,061 --> 16:54:38,432 Let me go ahead and\nrun Python in a moment 17717 16:54:38,432 --> 16:54:41,281 and I'll reveal the results in a sec. 17718 16:54:41,281 --> 16:54:44,131 This is going to be version\n8 of the code online. 17719 16:54:44,131 --> 16:54:47,822 When I do this, let me go\nahead and open up this file. 17720 16:54:47,822 --> 16:54:51,601 Give me a second to move\nit into this directory. 17721 16:54:53,911 --> 16:54:56,856 So here we have version 8 of\nthis that's available online 17722 16:54:56,857 --> 16:54:58,232 that's going to do the following. 17723 16:54:58,232 --> 16:55:00,190 And I'll gloss over some\nof the details just so 17724 16:55:00,190 --> 16:55:04,082 that we don't get stuck in the\nweeds of some of this code. 
17725 16:55:04,082 --> 16:55:06,722 I'm going to be using, at\nthe top of this program 17726 16:55:06,722 --> 16:55:11,162 as we'll soon see, a CS50 library,\n 17727 16:55:11,161 --> 16:55:14,072 or get_int, or get_float, but\nbecause there's some built-in SQL 17728 16:55:14,072 --> 16:55:17,222 functionality that we didn't discuss\n 17729 16:55:18,091 --> 16:55:22,021 But inside of the CS50 library we'll\n 17730 16:55:22,021 --> 16:55:26,671 SQL that gives you the ability using\n 17731 16:55:26,671 --> 16:55:31,381 technically called a URI, that allows\n 17732 16:55:31,381 --> 16:55:33,721 And long story short, all\nof the subsequent code 17733 16:55:33,722 --> 16:55:37,921 is going to iterate over this\n 17734 16:55:37,921 --> 16:55:41,582 And it's going to import it\ninto the SQLite database 17735 16:55:41,582 --> 16:55:44,772 but it's going to use two\ntables instead of just one. 17736 16:55:44,771 --> 16:55:46,981 So give me just a moment\nto run this, and then I'll 17737 16:55:48,942 --> 16:55:51,612 This is going to be\nrun on favorites.csv. 17738 16:55:56,851 --> 16:56:00,826 And taking a look here,\ngive me just a moment. 17739 16:56:11,421 --> 16:56:14,362 This program should not\nbe taking this long. 17740 16:56:24,841 --> 16:56:27,691 Let me just skim this code real\n 17741 16:56:30,182 --> 16:56:35,732 Reader, title, show ID,\ninsert into shows. 17742 16:56:35,732 --> 16:56:40,712 [INAUDIBLE] genres split, DB execute. 17743 16:56:41,222 --> 16:56:42,752 This is me debugging in real time. 17744 16:56:42,752 --> 16:56:48,184 All those times we encourage you to use\n 17745 16:56:48,184 --> 16:56:50,101 We'll see how quickly I\ncan recover from this. 17746 16:56:50,101 --> 16:56:51,902 Python of favorites version 8. 17747 16:56:54,442 --> 16:56:57,412 OK, so here's me debugging in real time. 17748 16:56:58,224 --> 16:56:59,932 Oh, maybe I just didn't\nwait long enough.
17749 16:57:01,281 --> 16:57:05,241 What I'm doing is printing out\nthe dictionary that represents 17750 16:57:05,241 --> 16:57:06,883 each row that you all typed in. 17751 16:57:06,883 --> 16:57:08,341 And we're actually making progress. 17752 16:57:09,561 --> 16:57:11,851 I was too impatient and\ndidn't wait long enough. 17753 16:57:13,311 --> 16:57:15,711 All right, so all we have\nto do sometimes is wait. 17754 16:57:15,711 --> 16:57:19,762 Let me go ahead now and open\nthis file using SQLite3. 17755 16:57:19,762 --> 16:57:23,422 So in SQLite3 I now have a\ndifferent version of favorites.db. 17756 16:57:23,421 --> 16:57:25,311 I named it number 8 for consistency. 17757 16:57:25,311 --> 16:57:28,621 Once I've run the program I can\ndo .schema to look inside of it. 17758 16:57:28,622 --> 16:57:32,542 And here's what the two tables in\n 17759 16:57:32,542 --> 16:57:36,262 I've created a table called shows, this\n 17760 16:57:36,262 --> 16:57:39,752 that are favorites,\nthat has two columns. 17761 16:57:39,752 --> 16:57:42,154 One is called ID, one is called Title. 17762 16:57:42,154 --> 16:57:44,362 But now I'm going to start\ntaking out for a spin some 17763 16:57:44,362 --> 16:57:45,982 of the other features of SQL. 17764 16:57:45,982 --> 16:57:50,242 And besides there being text, it turns\n 17765 16:57:50,241 --> 16:57:52,252 Besides there being a\ndata type called text 17766 16:57:52,252 --> 16:57:55,491 there's also a special key\nphrase that you can specify 17767 16:57:55,491 --> 16:57:57,171 that the title can never be null. 17768 16:57:57,171 --> 16:58:00,502 Think back to our use\nof null in C. Think back 17769 16:58:00,502 --> 16:58:02,601 to the keyword none in Python. 17770 16:58:02,601 --> 16:58:06,171 This is a database constraint that\n 17771 16:58:06,171 --> 16:58:07,972 can't have of favorite TV show. 
17772 16:58:07,972 --> 16:58:11,662 If you submit the form, you have\nto have typed in a title for it 17773 16:58:11,661 --> 16:58:13,591 to end up in our database here. 17774 16:58:13,591 --> 16:58:16,281 And you'll notice one other new feature. 17775 16:58:16,281 --> 16:58:18,801 It turns out, on this\ntable I'm defining what's 17776 16:58:18,802 --> 16:58:22,312 called a primary key,\nspecifically to be the ID column. 17777 16:58:22,311 --> 16:58:23,941 More on that in just a moment. 17778 16:58:23,942 --> 16:58:28,042 Meanwhile, the second table my code\n 17779 16:58:28,042 --> 16:58:33,561 gives me a column called\nshow ID, and then, a genre 17780 16:58:33,561 --> 16:58:36,481 the value of which is text\nthat can also not be null. 17781 16:58:36,482 --> 16:58:38,102 And then more on this in a moment. 17782 16:58:38,101 --> 16:58:41,182 This table has what we're\ngoing to call a foreign key 17783 16:58:41,182 --> 16:58:45,891 specifically the show ID column\nthat references shows ID. 17784 16:58:45,891 --> 16:58:48,591 So before we get into\nthe weeds of this, this 17785 16:58:48,591 --> 16:58:52,461 is now a way of creating the\nrelation in a relational database. 17786 16:58:52,461 --> 16:58:56,481 If I have two tables now, not\njust one, they can somehow 17787 16:58:56,482 --> 16:59:00,062 be linked together by a common column. 17788 16:59:00,061 --> 16:59:03,201 In other words, the shows column-- 17789 16:59:03,201 --> 16:59:06,682 shows table is going to give\nme a table with two columns-- 17790 16:59:08,421 --> 16:59:12,141 Every title you gave me, I'm\ngoing to assign a unique value. 17791 16:59:12,141 --> 16:59:16,942 The genres table, meanwhile, is\n 17792 16:59:16,942 --> 16:59:19,822 singular with that same idea.
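The two-table design described above, along with the split-and-insert loop the program performs, might look roughly like this. It is a sketch and not the course's version 8 file: it uses Python's standard sqlite3 module rather than the CS50 library, assumes INTEGER for the two ID columns (the transcript does not state their types), and loads two hypothetical rows whose genres are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# shows: one row per submitted title, identified by a unique id (the primary key).
conn.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")

# genres: one row per (show, genre) pair; show_id points back at shows.id.
conn.execute(
    "CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL, "
    "FOREIGN KEY(show_id) REFERENCES shows(id))"
)

# Hypothetical rows in the shape of the original CSV: title, comma-separated genres.
favorites = [
    ("How I Met Your Mother", "Comedy"),
    ("The Sopranos", "Crime, Drama"),
]

for title, genre_list in favorites:
    cursor = conn.execute("INSERT INTO shows (title) VALUES (?)", (title,))
    show_id = cursor.lastrowid  # the id SQLite assigned to the new show
    for genre in genre_list.split(", "):
        conn.execute(
            "INSERT INTO genres (show_id, genre) VALUES (?, ?)", (show_id, genre)
        )
conn.commit()

show_rows = conn.execute("SELECT id, title FROM shows ORDER BY id").fetchall()
genre_rows = conn.execute(
    "SELECT show_id, genre FROM genres ORDER BY show_id, genre"
).fetchall()
print(show_rows)
print(genre_rows)
```

Because each genre now sits in its own row, a show with several genres simply appears several times in the genres table under the same show_id.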
17793 16:59:19,822 --> 16:59:26,491 And the result of this, to pop back to\n 17794 16:59:26,491 --> 16:59:30,801 Select star from shows\nof this new database 17795 16:59:30,802 --> 16:59:34,372 and you'll see that I've given,\n 17796 16:59:36,004 --> 16:59:39,171 I didn't filter out duplicates or do\n 17797 16:59:39,771 --> 16:59:42,271 So there's going to be some\nduplicates here because I didn't 17798 16:59:42,271 --> 16:59:44,182 want to get rid of anyone's data. 17799 16:59:44,182 --> 16:59:47,302 But you'll see that,\nindeed, I've given everyone 17800 16:59:47,302 --> 16:59:49,762 a unique identifier, from\nthe very first person who 17801 16:59:49,762 --> 16:59:53,992 typed How I Met Your Mother, all\n 17802 16:59:53,991 --> 17:00:01,101 Meanwhile, if I do select star from\n 17803 17:00:01,101 --> 17:00:03,621 a column in the original\ndata, now you'll 17804 17:00:03,622 --> 17:00:08,537 see a much better design for this data. 17805 17:00:08,536 --> 17:00:09,661 Notice what I've done here. 17806 17:00:09,661 --> 17:00:12,711 Let me go all the way to the top and\n 17807 17:00:12,711 --> 17:00:16,849 is called show ID, the other\nof which is called genre. 17808 17:00:16,849 --> 17:00:18,891 And again, I wrote some\ncode to do this because I 17809 17:00:18,891 --> 17:00:22,016 had to take Google's messy output where\n 17810 17:00:22,016 --> 17:00:25,671 I had to tear away the commas and\n 17811 17:00:27,112 --> 17:00:29,542 Even though we haven't\nintroduced the syntax via which 17812 17:00:29,542 --> 17:00:32,572 we can reconstitute the\ndata and reassociate 17813 17:00:32,572 --> 17:00:35,872 your genres with your\ntitles, why, at a glance 17814 17:00:35,872 --> 17:00:38,482 might this be a better design now? 17815 17:00:38,482 --> 17:00:42,082 Even though I've doubled the\nnumber of tables from one to two 17816 17:00:42,082 --> 17:00:46,882 why is this probably on the\ndirection toward a better design? 
17817 17:00:46,881 --> 17:00:48,471 What might your instincts be? 17818 17:00:53,061 --> 17:00:56,572 Again, first time with SQL,\nwhy is it better, perhaps 17819 17:00:56,572 --> 17:00:59,122 that we've done this\nwith our genre's table? 17820 17:01:02,542 --> 17:01:06,702 Oh, just because we had the\n 17821 17:01:09,190 --> 17:01:14,472 We've cleaned up the data by giving\n 17822 17:01:14,472 --> 17:01:16,601 column in the original\nGoogle Spreadsheet 17823 17:01:16,601 --> 17:01:19,371 its own cell in this table, if you will. 17824 17:01:19,372 --> 17:01:22,272 And now notice show ID\nmight appear multiple times. 17825 17:01:22,271 --> 17:01:26,871 Whoever typed in How I Met Your Mother,\n 17826 17:01:26,872 --> 17:01:29,802 And so we see that\nshow ID 1 is a comedy. 17827 17:01:30,987 --> 17:01:32,862 I forget the name of\nthe second show offhand. 17828 17:01:32,862 --> 17:01:37,302 But that person, whoever was\nassigned show ID 2 checked off 17829 17:01:37,302 --> 17:01:39,192 a whole bunch of the genre's boxes. 17830 17:01:39,192 --> 17:01:42,882 That happened again with show ID 3, 4. 17831 17:01:42,881 --> 17:01:46,281 Persons 5, 6, 7 only checked one box. 17832 17:01:46,281 --> 17:01:50,381 And so you can see now that we've\n 17833 17:01:50,381 --> 17:01:53,031 might call a one-to-many relationship. 17834 17:01:53,031 --> 17:01:58,391 A one-to-many relationship, whereby\n 17835 17:01:58,391 --> 17:02:02,141 it can now have many genres\nassociated with it, each of which 17836 17:02:02,141 --> 17:02:06,591 is represented by a separate row here. 17837 17:02:06,591 --> 17:02:10,301 So again, if I go ahead and\nselect star from shows-- 17838 17:02:10,302 --> 17:02:14,082 let's limit it to the first 10 just\n 17839 17:02:14,082 --> 17:02:17,112 How I Met Your Mother, The Sopranos\nwas the second input there. 
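The clean-up step and the resulting one-to-many relationship might look something like this. It is only a sketch: the lecture's actual import script is not shown in this transcript, the sample submissions and their genres are hypothetical, and Python's built-in sqlite3 module is used here as an assumption.

```python
import sqlite3

# Hypothetical sample of the form's output: one title and one
# comma-separated genres string per submission.
submissions = [
    ("How I Met Your Mother", "Comedy"),
    ("The Sopranos", "Comedy, Crime, Drama"),
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.execute(
    "CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL, "
    "FOREIGN KEY(show_id) REFERENCES shows(id))"
)

for title, genres in submissions:
    # Insert the title; SQLite assigns the next integer id automatically.
    cursor = db.execute("INSERT INTO shows (title) VALUES (?)", (title.strip(),))
    show_id = cursor.lastrowid
    # One row per genre, with the commas and surrounding whitespace removed.
    for genre in genres.split(","):
        db.execute(
            "INSERT INTO genres (show_id, genre) VALUES (?, ?)",
            (show_id, genre.strip()),
        )

genre_rows = db.execute(
    "SELECT show_id, genre FROM genres ORDER BY show_id, genre"
).fetchall()
print(genre_rows)
```

Show 1 ends up with a single row in genres and show 2 with three, which is the one-to-many shape described above.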
17840 17:02:17,112 --> 17:02:20,442 It would seem that now that I've\ncreated the data in this way 17841 17:02:20,442 --> 17:02:25,266 I could ideally somehow search the\n 17842 17:02:25,266 --> 17:02:26,891 I don't have to worry about the commas. 17843 17:02:26,891 --> 17:02:29,266 I don't have to worry about\nthe hackish approach of music 17844 17:02:29,266 --> 17:02:30,972 being a substring of musical. 17845 17:02:30,972 --> 17:02:33,652 But how can I actually\nget back at this data? 17846 17:02:33,652 --> 17:02:35,092 Well, let's go ahead and do this. 17847 17:02:35,091 --> 17:02:39,041 Suppose I did want to get back\nmaybe all of the comedies. 17848 17:02:39,042 --> 17:02:42,461 All of the comedies, no matter whether\n 17849 17:02:42,461 --> 17:02:44,891 box or multiple boxes instead. 17850 17:02:44,891 --> 17:02:48,671 How now, given that I\nhave two tables, could I 17851 17:02:48,671 --> 17:02:53,082 go about selecting only\nthe titles of comedies? 17852 17:02:53,082 --> 17:02:55,122 I've actually made the\nproblem a little harder 17853 17:02:55,122 --> 17:02:58,062 but again, SQL is going to\ngive me a solution for this. 17854 17:02:58,061 --> 17:03:00,371 The problem is that if I\nwant to search for comedies 17855 17:03:00,372 --> 17:03:03,162 I have to check the genres table first. 17856 17:03:03,161 --> 17:03:04,991 And then what's that going to give me? 17857 17:03:04,991 --> 17:03:08,921 If I search the genres\ntable for comedies 17858 17:03:08,921 --> 17:03:11,951 what's that going to\ngive me back potentially? 17859 17:03:13,451 --> 17:03:14,451 DAVID J. MALAN: Maybe show ID. 17860 17:03:15,243 --> 17:03:21,161 Let me do select show ID from genres,\n 17861 17:03:21,161 --> 17:03:22,661 equals quote, unquote, "comedy. 17862 17:03:22,661 --> 17:03:25,811 No commas, no like, no percent signs. 17863 17:03:25,811 --> 17:03:30,281 Because literally, that column now is\n 17864 17:03:31,061 --> 17:03:33,011 Let me go ahead and hit Enter here. 
17865 17:03:33,012 --> 17:03:36,042 OK, so I got back a whole\nbunch of ID numbers. 17866 17:03:36,042 --> 17:03:38,442 Now this could very\nquickly get annoying. 17867 17:03:38,442 --> 17:03:43,272 It looks like show ID 1, 2, 4, 5, 6,\n 17868 17:03:43,271 --> 17:03:49,031 So I could do something really\n 17869 17:03:49,031 --> 17:03:54,731 where ID equals 1, or ID equals 2. 17870 17:03:54,732 --> 17:03:57,702 This is not going to\nscale very well, but this 17871 17:03:57,701 --> 17:03:59,951 is why SQL is especially powerful. 17872 17:03:59,951 --> 17:04:04,761 You can actually compose one\nSQL question from multiple ones. 17873 17:04:05,781 --> 17:04:09,822 Why don't I select the title\nwhere the ID of the show 17874 17:04:09,822 --> 17:04:13,391 is in the following list of IDs? 17875 17:04:13,391 --> 17:04:20,052 Select show ID from genres, where the\n 17876 17:04:20,982 --> 17:04:23,892 So I've got two SQL queries. 17877 17:04:23,891 --> 17:04:27,192 One is deliberately nested\ninside of parentheses. 17878 17:04:27,192 --> 17:04:30,102 That's going to give me back\nthat whole list of show IDs. 17879 17:04:30,101 --> 17:04:32,411 But that's exactly what\nI want to then look up 17880 17:04:32,411 --> 17:04:36,341 the titles for by selecting title\n 17881 17:04:38,722 --> 17:04:44,052 And so now if I hit Enter,\nI get back only those shows 17882 17:04:44,052 --> 17:04:47,862 that were somehow flagged as\ncomedy, whether you in the audience 17883 17:04:47,862 --> 17:04:51,912 checked one box for comedy,\ntwo boxes, or all of the boxes. 17884 17:04:51,911 --> 17:04:54,269 Somehow we teased out\ncomedy, again, just 17885 17:04:54,269 --> 17:04:56,561 by using that Python script,\nwhich loaded this data not 17886 17:04:56,561 --> 17:04:59,141 into one big table, but instead, two. 17887 17:04:59,141 --> 17:05:01,762 And if we want to clean this\nup, let's do a couple of things. 17888 17:05:01,762 --> 17:05:05,922 Let's, outside of the\nparentheses, do order by title. 
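The nested query described above can be tried on a small scale like this. The rows are hypothetical sample data rather than the audience's submissions, and the built-in sqlite3 module stands in for the sqlite3 command-line session used in the lecture.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL,
                         FOREIGN KEY(show_id) REFERENCES shows(id));
    -- Hypothetical sample rows, not the audience's actual data.
    INSERT INTO shows (id, title) VALUES
        (1, 'The Sopranos'), (2, 'How I Met Your Mother'), (3, 'Breaking Bad');
    INSERT INTO genres (show_id, genre) VALUES
        (1, 'Comedy'), (1, 'Drama'), (2, 'Comedy'), (3, 'Drama');
""")

# The inner query yields the ids of all comedies; the outer query looks up
# their titles and sorts them.
comedies = [row[0] for row in db.execute("""
    SELECT title FROM shows
    WHERE id IN (SELECT show_id FROM genres WHERE genre = 'Comedy')
    ORDER BY title
""")]
print(comedies)
```

Running the inner SELECT on its own first, then wrapping it, mirrors the step-by-step way the query is built up in the lecture.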
17889 17:05:05,921 --> 17:05:08,891 This is a way of sorting\nthe data in SQL very easily. 17890 17:05:08,891 --> 17:05:13,481 Now we have a whole list of the\nsame titles that are now sorted. 17891 17:05:13,482 --> 17:05:17,982 And what was the keyword with which\n 17892 17:05:19,601 --> 17:05:24,941 Same query, but let's select only the\n 17893 17:05:24,942 --> 17:05:27,112 And notice, I've very\ndeliberately done it this way. 17894 17:05:27,112 --> 17:05:28,862 And to this day, any\ntime I'm using SQL, I 17895 17:05:28,862 --> 17:05:31,529 don't just start at the beginning\nand type out my whole thought 17896 17:05:31,529 --> 17:05:33,281 and just get it right on the first try. 17897 17:05:33,281 --> 17:05:35,951 I very commonly start\nwith the subquery, if you 17898 17:05:35,951 --> 17:05:38,141 will, the thing in\nparentheses, just to get myself 17899 17:05:38,141 --> 17:05:39,942 one step toward what I care about. 17900 17:05:41,381 --> 17:05:43,932 Then I add to it, just like\nwe've encouraged in Python and C 17901 17:05:43,932 --> 17:05:47,711 taking baby steps in order to get to\n 17902 17:05:48,822 --> 17:05:51,402 And other than this\nmistake, which we didn't 17903 17:05:51,402 --> 17:05:55,692 fix because I re-imported the data after\n 17904 17:05:55,692 --> 17:06:00,531 we now have an alphabetized\nlist of all of the same data. 17905 17:06:00,531 --> 17:06:06,012 But now it's better designed, because we\n 17906 17:06:10,061 --> 17:06:14,292 What questions do we have, if any here? 17907 17:06:25,622 --> 17:06:27,622 DAVID J. MALAN: Oh, now\nthat we have a database 17908 17:06:27,622 --> 17:06:29,982 how do we transfer it to a CSV? 17909 17:06:31,402 --> 17:06:33,912 And in fact, there's a\ncommand within SQLite 17910 17:06:33,911 --> 17:06:36,931 that allows you to export\nyour data back to a CSV file. 
17911 17:06:36,932 --> 17:06:38,682 If you want to email\nit to someone and you 17912 17:06:38,682 --> 17:06:41,771 want them to be able to open it in\n 17913 17:06:41,771 --> 17:06:44,351 Numbers, or the like, you can\ngo in the other direction. 17914 17:06:44,351 --> 17:06:47,231 Generally though, once\nyou're in the world of SQL 17915 17:06:47,232 --> 17:06:49,962 you're probably storing\nyour data there long term. 17916 17:06:49,961 --> 17:06:52,932 And you're probably updating it,\n 17917 17:06:53,614 --> 17:06:55,781 For instance, the one command\nI did not show earlier 17918 17:06:55,781 --> 17:06:58,661 is, suppose someone forgot a show. 17919 17:06:58,661 --> 17:07:00,981 Let's see, did I see this in the output? 17920 17:07:00,982 --> 17:07:02,922 All right, so Curb Your Enthusiasm. 17921 17:07:04,872 --> 17:07:06,502 Did anyone see it last night? 17922 17:07:07,002 --> 17:07:10,084 All right, well, just the one person\n 17923 17:07:10,084 --> 17:07:12,082 What's another show that\ndidn't make the list? 17924 17:07:13,559 --> 17:07:14,891 It's now on Netflix, apparently. 17925 17:07:21,732 --> 17:07:25,252 Well, we want to insert\nmaybe an ID and a title. 17926 17:07:25,252 --> 17:07:27,491 But I don't actually\ncare what the ID is 17927 17:07:27,491 --> 17:07:28,991 so I'm just going to insert a title. 17928 17:07:28,991 --> 17:07:31,121 And the value I'm going\nto give to that title 17929 17:07:31,122 --> 17:07:34,242 is going to be, quote,\nunquote, "Seinfeld. 17930 17:07:34,241 --> 17:07:37,152 And then, let me go\nahead and hit semicolon. 17931 17:07:37,152 --> 17:07:39,702 Nothing seems to happen, but\nlet me rerun the big query 17932 17:07:39,701 --> 17:07:41,591 from before looking for comedies. 17933 17:07:41,591 --> 17:07:45,191 And unfortunately, Seinfeld has\n 17934 17:07:45,192 --> 17:07:47,052 so let's get this right, too. 
17935 17:07:47,052 --> 17:07:50,712 What intuitively I'm going to\nhave to do to associate, now 17936 17:07:53,322 --> 17:07:55,482 I just inserted into the show's table. 17937 17:07:55,482 --> 17:07:59,232 What more needs to happen before\n 17938 17:08:03,851 --> 17:08:08,292 So I need to insert into the\ngenres table two things now 17939 17:08:08,292 --> 17:08:13,522 a show ID, like this, and\nthen, the name of the genre 17940 17:08:14,682 --> 17:08:16,152 What values do I want to insert? 17941 17:08:16,152 --> 17:08:18,137 Well, the show ID, I better grab that. 17942 17:08:18,137 --> 17:08:19,512 Oh, I don't even know what it is. 17943 17:08:19,512 --> 17:08:21,112 I'm going to have to\nfigure out what that is. 17944 17:08:21,112 --> 17:08:23,002 So I could do this in a couple of ways. 17945 17:08:24,491 --> 17:08:28,265 Select star from shows,\nwhere title equals 17946 17:08:28,266 --> 17:08:32,112 quote, unquote,\n"Seinfeld" semicolon 159. 17947 17:08:32,112 --> 17:08:37,122 So now I could do, insert\ninto genres a show ID 17948 17:08:37,122 --> 17:08:45,852 and a genre name, the values 159, and,\n 17949 17:08:46,601 --> 17:08:50,051 And now, if I scroll back in my history\n 17950 17:08:50,052 --> 17:08:52,032 again, looking for\nall distinct comedies 17951 17:08:52,031 --> 17:08:54,442 now Seinfeld has made the list. 17952 17:08:54,442 --> 17:08:57,978 But I did this manually so I\ndidn't actually capitalize it. 17953 17:09:03,042 --> 17:09:08,781 Set title equals to Seinfeld semicolon. 17954 17:09:09,281 --> 17:09:13,121 OK, thank you, where title equals,\nquote, unquote, "Seinfeld. 17955 17:09:13,122 --> 17:09:14,862 Let's not make that mistake again. 17956 17:09:15,671 --> 17:09:18,612 And now, if I execute that really\nbig query, now Seinfeld is 17957 17:09:18,612 --> 17:09:21,822 indeed, considered a comedy. 17958 17:09:21,822 --> 17:09:23,302 So where are we going with this? 
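The insert, look-up, and update steps above could be replayed in a sketch like the following. It assumes Python's built-in sqlite3 module and an empty in-memory database, so the id assigned here will be 1 rather than the 159 seen in the lecture.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id));
    CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL,
                         FOREIGN KEY(show_id) REFERENCES shows(id));
""")

# Insert only a title (deliberately lowercase here); the id is assigned
# automatically.
db.execute("INSERT INTO shows (title) VALUES (?)", ("seinfeld",))

# Look up the id that was assigned rather than guessing it.
show_id = db.execute(
    "SELECT id FROM shows WHERE title = ?", ("seinfeld",)
).fetchone()[0]

# Associate the new show with a genre.
db.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)", (show_id, "Comedy"))

# Fix the capitalization. Without the WHERE clause, every row in the
# table would be updated.
db.execute("UPDATE shows SET title = ? WHERE title = ?", ("Seinfeld", "seinfeld"))

result = db.execute("""
    SELECT title FROM shows
    WHERE id IN (SELECT show_id FROM genres WHERE genre = 'Comedy')
""").fetchall()
print(result)
```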
17959 17:09:23,302 --> 17:09:25,582 Well, thus far we've been doing\nall of this pretty manually. 17960 17:09:25,582 --> 17:09:28,122 And this is absolutely what an\n 17961 17:09:28,122 --> 17:09:30,539 might do if just manipulating\na pretty large data set just 17962 17:09:30,538 --> 17:09:33,351 to get at interesting answers\nthat might be across one 17963 17:09:33,351 --> 17:09:34,781 two, or even many more tables. 17964 17:09:34,781 --> 17:09:37,781 Eventually, in a few weeks, we're\n 17965 17:09:37,781 --> 17:09:41,951 by writing code in Python\nthat generates SQL to do this. 17966 17:09:41,951 --> 17:09:44,621 If you go to most any website\non the internet today 17967 17:09:44,622 --> 17:09:48,762 and you, for instance, log in, odds are\n 17968 17:09:50,771 --> 17:09:53,351 Well, the website might not\nbe implemented in Python 17969 17:09:53,351 --> 17:09:56,951 but it's probably implemented in some\n 17970 17:09:58,451 --> 17:10:03,671 And that language is probably using\n 17971 17:10:03,671 --> 17:10:07,271 to use SQL to get your\nusername, get your password 17972 17:10:07,271 --> 17:10:09,338 and compare the two against\nwhat you've typed in. 17973 17:10:09,338 --> 17:10:11,921 And actually, it's hopefully not\ngetting your actual password 17974 17:10:11,921 --> 17:10:13,504 but something called the hash thereof. 17975 17:10:13,504 --> 17:10:15,701 But there's probably a\ndatabase involved doing that. 
17976 17:10:15,701 --> 17:10:18,941 When you buy something on\nAmazon.com and you click Check Out 17977 17:10:18,942 --> 17:10:22,062 odds are there's some\ncode on Amazon's server 17978 17:10:22,061 --> 17:10:25,211 that's looking at what it is\nyou added to your shopping cart 17979 17:10:25,211 --> 17:10:29,031 and then maybe using a for loop of some\n 17980 17:10:29,031 --> 17:10:33,521 It's doing a whole bunch of SQL\n 17981 17:10:34,631 --> 17:10:37,601 There's other types of databases,\ntoo, but SQL databases 17982 17:10:37,601 --> 17:10:39,989 or relational databases\nare quite popular. 17983 17:10:39,989 --> 17:10:42,072 So let's go ahead and write\none other program here 17984 17:10:42,072 --> 17:10:46,661 in Python that now merges these\ntwo languages together, whereby 17985 17:10:46,661 --> 17:10:50,081 I'm going to use SQL\ninside of a Python program 17986 17:10:50,082 --> 17:10:53,772 so I can implement my logic\nof my program in Python 17987 17:10:55,391 --> 17:10:59,601 But when I want to get at some data I\n 17988 17:10:59,601 --> 17:11:02,531 So let me go ahead\nand open favorites.py. 17989 17:11:05,082 --> 17:11:10,542 And let me go ahead and throw away\n 17990 17:11:10,542 --> 17:11:13,031 just now add a SQL to the mix. 17991 17:11:13,031 --> 17:11:16,781 From the CS50 library, let's\nimport the SQL function. 17992 17:11:16,781 --> 17:11:19,601 This will be useful to use\nbecause most third-party libraries 17993 17:11:19,601 --> 17:11:22,731 that deal with SQL and Python are\n 17994 17:11:22,732 --> 17:11:25,732 So I think you'll find\nthis library easier to use. 17995 17:11:25,732 --> 17:11:27,122 Let's then do the following. 17996 17:11:27,122 --> 17:11:29,242 Create a variable\ncalled db for database. 17997 17:11:29,241 --> 17:11:30,741 But I could call it anything I want. 
17998 17:11:30,741 --> 17:11:34,432 Let's use a URI, which is\na fancy way of saying something 17999 17:11:34,432 --> 17:11:43,131 that looks like a URL, but that actually\n 18000 17:11:44,451 --> 17:11:47,961 Let's now ask the user for a title by\n 18001 17:11:49,161 --> 17:11:53,301 And let's strip off any whitespace\n 18002 17:11:53,302 --> 17:11:56,072 And then, let's go ahead and do this. 18003 17:11:57,561 --> 17:12:01,911 I'm going to go ahead now and write\n 18004 17:12:01,911 --> 17:12:05,181 to talk to the original favorites.db. 18005 17:12:05,182 --> 17:12:09,322 So again, I'm not using the two-table\n 18006 17:12:09,322 --> 17:12:12,661 I'm using the original that we\nimported from your own data 18007 17:12:12,661 --> 17:12:14,431 and I'm going to do the following. 18008 17:12:14,432 --> 17:12:19,491 I'm going to use db.execute to execute\n 18009 17:12:19,491 --> 17:12:28,042 I'm going to select the count\nof shows from the favorites 18010 17:12:28,042 --> 17:12:35,302 table, where the title the user\n 18011 17:12:35,302 --> 17:12:37,252 And why I'm doing that is as follows. 18012 17:12:37,252 --> 17:12:40,942 Just like in C, when we had\npercent S, in SQL for now 18013 17:12:40,942 --> 17:12:42,832 the analogue is going\nto be a question mark. 18014 17:12:42,832 --> 17:12:44,332 So same idea, different syntax. 18015 17:12:44,332 --> 17:12:46,522 Instead of percent S,\nit's just a question mark. 18016 17:12:46,521 --> 17:12:51,621 And using a comma outside of this\n 18017 17:12:51,622 --> 17:12:54,682 function I can pass in\na SQL string, a command 18018 17:12:54,682 --> 17:12:59,177 then any arguments I want to plug\n 18019 17:12:59,177 --> 17:13:01,552 So the goal at hand is to\nactually write a program that's 18020 17:13:01,552 --> 17:13:07,762 going to search favorites.csv, a.k.a.,\n 18021 17:13:07,762 --> 17:13:10,641 of people that liked a particular show.
18022 17:13:10,641 --> 17:13:14,391 So this is going to select the count\n 18023 17:13:14,391 --> 17:13:18,741 where the title they typed in is like\n 18024 17:13:19,311 --> 17:13:22,072 This db execute function returns a list. 18025 17:13:23,154 --> 17:13:25,529 And you would only know that\nby my telling you or reading 18026 17:13:26,631 --> 17:13:29,481 And therefore, if I want to\nget back to the total count 18027 17:13:29,482 --> 17:13:34,282 I'm going to go ahead and grab\nthe first row from those rows. 18028 17:13:34,281 --> 17:13:36,561 Because it's only going\nto give me back the count. 18029 17:13:36,561 --> 17:13:41,781 And then I'm going to go ahead and\n 18030 17:13:41,781 --> 17:13:43,281 But it's going to be a little weird. 18031 17:13:43,281 --> 17:13:46,762 Technically the column is going to be\n 18032 17:13:47,762 --> 17:13:49,522 Let me add one more feature to the mix. 18033 17:13:49,521 --> 17:13:51,621 You can actually give\nnicknames to columns 18034 17:13:51,622 --> 17:13:55,432 that are coming back, especially if they\n 18035 17:13:55,432 --> 17:13:59,512 I can just call that column\ncounter, in all lowercase. 18036 17:13:59,512 --> 17:14:06,961 That means I can now say get back the\n 18037 17:14:06,961 --> 17:14:08,701 So just to recap, what have we done? 18038 17:14:08,701 --> 17:14:11,351 We've imported the CS50\nlibrary SQL function. 18039 17:14:11,351 --> 17:14:14,421 We've, with this line of\ncode, opened the favorites.db 18040 17:14:14,421 --> 17:14:20,152 file that you and I created earlier\n 18041 17:14:20,152 --> 17:14:23,482 I'm now just asking the user for\n 18042 17:14:23,482 --> 17:14:27,412 I'm now executing this SQL\nquery on that database 18043 17:14:27,411 --> 17:14:30,591 plugging in whatever the\nhuman typed in as their title 18044 17:14:30,591 --> 17:14:32,511 in order to get back a total count. 
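The placeholder-and-alias idea can be sketched without the CS50 library as well. The version below swaps in Python's built-in sqlite3 module, which is an assumption of this sketch and not the lecture's method: there, parameters are passed as a tuple, and sqlite3.Row is used so that a row can be indexed by column name much like the dictionaries described above. The sample titles are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# sqlite3.Row lets each row be indexed by column name, similar in spirit
# to the dictionaries the lecture's library returns.
db.row_factory = sqlite3.Row

db.execute("CREATE TABLE favorites (title TEXT)")
db.executemany(
    "INSERT INTO favorites (title) VALUES (?)",
    [("The Office",), ("the office",), ("Friends",)],
)

title = "The Office"  # stands in for whatever the user types

# The question mark is a placeholder; the value is supplied separately
# instead of being pasted into the SQL string.
rows = db.execute(
    "SELECT COUNT(*) AS counter FROM favorites WHERE title LIKE ?", (title,)
).fetchall()

# Even a single count comes back as a list of rows, so take the first row
# and then the column named by the alias.
row = rows[0]
print(row["counter"])
```

Keeping the value out of the SQL string also means that quotes or other special characters in the user's input cannot change the meaning of the query.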
18045 17:14:32,512 --> 17:14:36,292 And I'm giving the count a\nnickname, an alias of counter 18046 17:14:36,292 --> 17:14:39,202 just so it's more self-explanatory. 18047 17:14:39,201 --> 17:14:43,671 This function, db.execute, no matter\n 18048 17:14:43,671 --> 17:14:45,811 even if there's only\none row inside of it. 18049 17:14:45,811 --> 17:14:48,651 So this line of code just gives\nme the first and only row. 18050 17:14:48,652 --> 17:14:53,302 And then, this goes inside of that row,\n 18051 17:14:53,302 --> 17:14:59,072 and gives me the key counter\nand the value it corresponds to. 18052 17:14:59,072 --> 17:15:00,682 So what, to be clear, is this doing? 18053 17:15:00,682 --> 17:15:03,561 Let's go ahead and run this manually\n 18054 17:15:03,561 --> 17:15:07,311 Let me run SQLite3 on favorites-- 18055 17:15:08,722 --> 17:15:12,752 On favorites.db, let me\nimport the data again. 18056 17:15:12,752 --> 17:15:20,252 So .mode csv, then .import\nfavorites.csv into a favorites table. 18057 17:15:20,252 --> 17:15:22,671 So I've just recreated the\nsame data set that you all 18058 17:15:22,671 --> 17:15:25,123 gave me earlier in favorites.db. 18059 17:15:25,124 --> 17:15:27,832 If I were to do this manually,\n 18060 17:15:27,832 --> 17:15:34,552 Select count star from favorites,\nwhere title like, and let's 18061 17:15:34,552 --> 17:15:37,612 just manually type it\nin for now, The Office. 18062 17:15:37,612 --> 17:15:40,671 We'll search for the one\nwith the word The, semicolon. 18063 17:15:41,902 --> 17:15:44,122 But technically, notice what I get back. 18064 17:15:44,122 --> 17:15:50,422 I technically get back a miniature\n 18065 17:15:50,421 --> 17:15:52,432 What if I want to rename that column? 18066 17:15:52,432 --> 17:15:54,182 That's where the as keyword comes in. 18067 17:15:54,182 --> 17:15:56,421 So select count star as counter.
18068 17:15:58,252 --> 17:16:01,224 I just get back-- same\nsimple table, but I've 18069 17:16:01,224 --> 17:16:03,891 renamed the column to be counter\njust because it's a little more 18070 17:16:03,891 --> 17:16:05,752 self-explanatory as to what it is. 18071 17:16:05,752 --> 17:16:08,542 So what am I doing\nwith this line of code? 18072 17:16:08,542 --> 17:16:12,772 This line of code is returning to\n 18073 17:16:12,771 --> 17:16:16,161 in the form of a list of dictionaries. 18074 17:16:16,161 --> 17:16:20,961 The list contains one\nrow, as we'll see, and it 18075 17:16:20,961 --> 17:16:26,141 contains one column, as we'll\nsee, the key for which is counter. 18076 17:16:26,141 --> 17:16:27,881 So let's now run the code itself. 18077 17:16:27,881 --> 17:16:32,621 I'm going to get out of SQLite3 and I'm\n 18078 17:16:33,461 --> 17:16:34,881 I'm being prompted for a title. 18079 17:16:34,881 --> 17:16:39,131 I'm going to type in The Office and\n 18080 17:16:40,152 --> 17:16:42,792 Well, there's a typo again\nbecause I re-imported the CSV. 18081 17:16:42,792 --> 17:16:46,612 I had deleted two of the Thes, so\n 18082 17:16:46,612 --> 17:16:51,101 So there's 12 total that have,\nquote, unquote, "The Office." 18083 17:16:54,201 --> 17:16:57,161 We've combined some\nPython with some SQL 18084 17:16:57,161 --> 17:17:00,131 but we've relegated all of the\n 18085 17:17:00,131 --> 17:17:02,141 the selecting of something,\ngotten rid of all 18086 17:17:02,141 --> 17:17:04,902 of the with keyword, the\nopen keyword, the for loop 18087 17:17:04,902 --> 17:17:06,942 the reader, the DictReader,\nand all of that. 18088 17:17:06,942 --> 17:17:11,802 And it's just one line of SQL now,\n 18089 17:17:11,802 --> 17:17:17,062 All right, any questions on what we've\n 18090 17:17:26,411 --> 17:17:29,754 DAVID J. MALAN: When does this\n 18091 17:17:31,372 --> 17:17:33,582 So let's do that by changing\nthe problem at hand.
18092 17:17:33,582 --> 17:17:36,312 This program was designed just\nto select the total count. 18093 17:17:36,311 --> 17:17:41,411 Let's go ahead and\nselect, for instance, all 18094 17:17:41,411 --> 17:17:46,046 of the ways you all typed in The Office\n 18095 17:17:49,451 --> 17:17:53,811 If I do this in SQLite3, let\nme go ahead and do this again 18096 17:17:53,811 --> 17:17:55,311 after increasing my Terminal window. 18097 17:17:56,262 --> 17:18:00,912 Select title from favorites,\nwhere the title is like 18098 17:18:00,911 --> 17:18:04,176 quote, unquote, "The Office," semicolon. 18099 17:18:04,177 --> 17:18:07,302 I get back all of these different rows,\n 18100 17:18:07,302 --> 17:18:09,252 There's actually another\nlittle typo in there 18101 17:18:09,252 --> 17:18:12,972 with some capitalization of the\nE, and the C, and the E. That 18102 17:18:12,972 --> 17:18:16,182 would be an example of a query\nthat gives me back therefore 18103 17:18:17,421 --> 17:18:19,332 So let's now change my Python program. 18104 17:18:19,332 --> 17:18:24,882 If I now, in my Python program, do\n 18105 17:18:24,881 --> 17:18:26,451 containing all of those titles. 18106 17:18:26,451 --> 17:18:31,691 I can now do, for row in rows, I can\n 18107 17:18:31,692 --> 17:18:34,732 and now manipulate all\nof those things together. 18108 17:18:34,732 --> 17:18:36,102 Let me keep both on the screen. 18109 17:18:36,101 --> 17:18:37,661 Let me run Python of favorites.py. 18110 17:18:37,661 --> 17:18:41,661 And that for loop now should\niterate, what, 10 or more times 18111 17:18:41,661 --> 17:18:43,361 once for each of those titles. 18112 17:18:43,362 --> 17:18:47,351 And indeed, if I type in\nThe Office again, Enter. 18113 17:18:52,091 --> 17:18:55,394 Oh, I should not be renaming\ntitle to counter this time. 18114 17:18:55,394 --> 17:18:57,101 So that's just a dumb\nmistake on my part. 
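A query that returns several rows, and the loop over them, might be sketched like this. As before, Python's built-in sqlite3 module is swapped in for the CS50 library as an assumption of the sketch, and the titles are hypothetical sample data. Note that the column is selected under its own name, title, with no alias.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # rows can then be indexed by column name

db.execute("CREATE TABLE favorites (title TEXT)")
db.executemany(
    "INSERT INTO favorites (title) VALUES (?)",
    [("The Office",), ("the office",), ("ThE OffiCE",), ("Friends",)],
)

title = "The Office"  # stands in for whatever the user types

rows = db.execute(
    "SELECT title FROM favorites WHERE title LIKE ?", (title,)
).fetchall()

# One iteration per matching row.
matches = []
for row in rows:
    matches.append(row["title"])
    print(row["title"])

# For comparison: = matches only the exact capitalization, whereas
# SQLite's LIKE ignores case for ASCII letters.
exact = db.execute(
    "SELECT COUNT(*) AS n FROM favorites WHERE title = ?", (title,)
).fetchone()["n"]
print(exact)
```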
18115 17:18:58,752 --> 17:19:01,572 And now I should see after\ntyping in The Office 18116 17:19:01,572 --> 17:19:03,762 Enter, a whole bunch of The Offices. 18117 17:19:03,762 --> 17:19:05,862 And because I'm using\nlike, even the missed 18118 17:19:05,862 --> 17:19:08,891 capitalizations are coming through,\n 18119 17:19:08,891 --> 17:19:11,231 It doesn't matter if it's\nuppercase or lowercase. 18120 17:19:11,232 --> 17:19:15,642 Whereas had I used the equal sign\n 18121 17:19:17,277 --> 17:19:20,592 All right, any questions on this next? 18122 17:19:20,591 --> 17:19:25,041 All right, so let's transition\nto a larger, juicier data 18123 17:19:25,042 --> 17:19:26,952 set, and consider some\nof the issues that 18124 17:19:26,951 --> 17:19:31,211 arise when actually now using SQL and\n 18125 17:19:31,211 --> 17:19:34,311 using SQL for mobile apps, web\napps, and generally speaking 18126 17:19:36,141 --> 17:19:39,432 So let's start with a larger\ndata set just like that. 18127 17:19:39,432 --> 17:19:45,141 Give me just a moment to switch screens\n 18128 17:19:45,141 --> 17:19:48,311 which is an actual relational\ndatabase that we've created out 18129 17:19:48,311 --> 17:19:51,881 of a real-world data set from IMDb. 18130 17:19:51,881 --> 17:19:54,551 So InternetMovieDatabase.com\nis a website 18131 17:19:54,552 --> 17:19:57,132 where you can search for TV\nshows, and movies, and actors 18132 17:19:57,131 --> 17:20:00,221 and so forth, all using their\ndatabase behind the scenes. 18133 17:20:00,222 --> 17:20:04,872 IMDb wonderfully makes their data\n 18134 17:20:04,872 --> 17:20:08,302 but TSV files, tab-separated values. 
18135 17:20:08,302 --> 17:20:11,802 And so what we did is, before class\n 18136 17:20:11,802 --> 17:20:15,641 We wrote a Python program\nsimilar to my favorites8.py file 18137 17:20:15,641 --> 17:20:19,091 earlier that read in\nall of those TSV files 18138 17:20:19,091 --> 17:20:24,161 created some SQL tables\nin an IMDb database 18139 17:20:24,161 --> 17:20:28,611 for you in SQLite that has multiple\ntables and multiple columns. 18140 17:20:28,612 --> 17:20:32,531 So let's go and wrap our minds around\n 18141 17:20:32,531 --> 17:20:36,281 Let me go back to VS Code\nhere, and in just a moment 18142 17:20:36,281 --> 17:20:40,601 I'm going to go ahead and copy the\n 18143 17:20:40,601 --> 17:20:45,851 And I'm going to go ahead and increase\n 18144 17:20:45,851 --> 17:20:48,911 Whenever playing around with a\n 18145 17:20:48,911 --> 17:20:51,822 typing .schema is perhaps a good\n 18146 17:20:52,752 --> 17:20:54,461 And things just escalated quickly. 18147 17:20:54,461 --> 17:20:56,981 There's a lot in this data\nset, because, indeed, there's 18148 17:20:56,982 --> 17:21:01,092 going to be tens of hundreds of\n 18149 17:21:01,091 --> 17:21:04,572 and also problem set 7, where we'll\n 18150 17:21:06,262 --> 17:21:09,281 So what is the schema that\nwe have created for you 18151 17:21:09,281 --> 17:21:12,491 from IMDb's actual real-world data? 18152 17:21:12,491 --> 17:21:14,292 One, there's a table called shows. 18153 17:21:14,292 --> 17:21:17,292 And notice we've just added whitespace\n 18154 17:21:17,292 --> 17:21:19,391 to make it a little more\nstylistically readable. 18155 17:21:19,391 --> 17:21:23,322 The shows table has an ID\ncolumn, a title column, a year 18156 17:21:23,322 --> 17:21:26,082 and the total number of\nepisodes for a given show. 
18157 17:21:26,082 --> 17:21:31,092 And the types of those columns are\n 18158 17:21:31,091 --> 17:21:33,431 So it turns out there's\nactually a few different data 18159 17:21:33,432 --> 17:21:39,192 types that are worth being aware of when\n 18160 17:21:39,192 --> 17:21:43,512 In fact, in SQLite there's\nfive data types, and only five 18161 17:21:43,512 --> 17:21:46,992 fortunately, one of which is, indeed,\n 18162 17:21:46,991 --> 17:21:50,351 numeric, which is kind of a\ncatchall for dates and times 18163 17:21:50,351 --> 17:21:52,661 things that are numeric\nbut are not just integers 18164 17:21:52,661 --> 17:21:54,851 and not just real numbers, for instance. 18165 17:21:54,851 --> 17:21:58,362 Real number is what we've generally\n 18166 17:21:58,362 --> 17:22:00,400 Text, of course, is\njust text, but notice 18167 17:22:00,400 --> 17:22:02,442 that you don't have to\nworry about how big it is. 18168 17:22:02,442 --> 17:22:04,452 Like in Python, it will size to fit. 18169 17:22:04,451 --> 17:22:07,182 And then there's BLOB, which\nis binary large object, which 18170 17:22:07,182 --> 17:22:10,641 is for just raw 0s and 1s, like\nfor files or things like that. 18171 17:22:10,641 --> 17:22:12,911 But we'll generally use\nthe other four of these. 18172 17:22:12,911 --> 17:22:16,301 And so, indeed, when we\nimported this data for you 18173 17:22:16,302 --> 17:22:21,612 we decided that every show would be\n 18174 17:22:21,612 --> 17:22:24,802 Every show has, of course, a\ntitle, which should not be null. 18175 17:22:24,802 --> 17:22:26,662 Otherwise, why is it in the database? 18176 17:22:26,661 --> 17:22:30,171 Every show has a year,\nwhich is numeric according 18177 17:22:30,171 --> 17:22:31,521 to that definition a moment ago. 
18178 17:22:31,521 --> 17:22:34,881 And the total number of episodes for\n 18179 17:22:34,881 --> 17:22:38,451 What now is with these primary keys\n 18180 17:22:38,451 --> 17:22:43,432 A primary key is the column that\n 18181 17:22:43,432 --> 17:22:46,461 In our case, with the\nfavorites, I automatically 18182 17:22:46,461 --> 17:22:50,091 gave each of your submissions a unique\n 18183 17:22:50,091 --> 17:22:52,701 typed in The Office,\nyour submission still 18184 17:22:52,701 --> 17:22:57,651 had a unique identifier, a number\n 18185 17:22:57,652 --> 17:23:01,671 with your genres, just\nas we saw a moment ago. 18186 17:23:01,671 --> 17:23:04,621 In this version of IMDb,\nthere's also genres. 18187 17:23:04,622 --> 17:23:07,461 But they don't come from\nus, they come from IMDb.com. 18188 17:23:07,461 --> 17:23:11,661 And so a genre has a show ID, and\n 18189 17:23:11,661 --> 17:23:15,231 But these are real-world genres\nwith a bit more filtration. 18190 17:23:15,232 --> 17:23:19,862 Notice, though, just like my\nversion, there's a foreign key. 18191 17:23:19,862 --> 17:23:25,311 A foreign key is the appearance\nof another table's primary key 18192 17:23:27,391 --> 17:23:30,442 So when you have a table\nlike genres, which is somehow 18193 17:23:30,442 --> 17:23:36,262 cross referencing the original shows\n 18194 17:23:36,262 --> 17:23:40,612 called ID, and those same numbers\nappear in the genres table 18195 17:23:40,612 --> 17:23:45,592 under the column called show ID, by\n 18196 17:23:45,591 --> 17:23:47,991 It's the same numbers but\nit's foreign in the sense 18197 17:23:47,991 --> 17:23:50,601 that the number is being\nused in this table 18198 17:23:50,601 --> 17:23:54,472 even though it's officially defined\n 18199 17:23:54,472 --> 17:23:57,112 This is what we mean by\nrelational databases. 
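Column types and the primary key and foreign key relationship can be seen in a small sketch like the one below. It assumes Python's built-in sqlite3 module, the type of the episodes column is a guess because the transcript cuts off at that point, and the sample row is only illustrative. One SQLite detail worth knowing: foreign keys are only enforced on a connection after PRAGMA foreign_keys = ON.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
db.execute("PRAGMA foreign_keys = ON")

db.executescript("""
    CREATE TABLE shows (
        id INTEGER,
        title TEXT NOT NULL,
        year NUMERIC,
        episodes INTEGER,  -- assumed type; the transcript cuts off here
        PRIMARY KEY(id)
    );
    CREATE TABLE genres (
        show_id INTEGER NOT NULL,
        genre TEXT NOT NULL,
        FOREIGN KEY(show_id) REFERENCES shows(id)
    );
""")

# Illustrative sample row.
db.execute(
    "INSERT INTO shows (id, title, year, episodes) VALUES (?, ?, ?, ?)",
    (1, "The Office", 2005, 188),
)
db.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)", (1, "Comedy"))

# Column names and declared types, as SQLite reports them.
columns = [(row[1], row[2]) for row in db.execute("PRAGMA table_info(shows)")]
print(columns)

# A genre that points at a show id that does not exist is rejected.
try:
    db.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)", (999, "Drama"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

With the pragma left off, the second insert would succeed and leave a genre row pointing at no show at all.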
18200 17:23:57,112 --> 17:24:02,512 You have multiple tables with some\n 18201 17:24:02,512 --> 17:24:06,202 And those numbers allow you to line\n 18202 17:24:06,201 --> 17:24:09,381 that you can reconnect the\nshows with their genres 18203 17:24:09,381 --> 17:24:12,141 just like we did with our\nsmaller data set a moment ago. 18204 17:24:12,141 --> 17:24:14,391 This logic is extended further. 18205 17:24:14,391 --> 17:24:18,411 Notice that the IMDb database we've\n 18206 17:24:18,411 --> 17:24:22,072 like TV show stars, the actors therein. 18207 17:24:22,072 --> 17:24:25,552 And that table, interestingly,\nhas no mention of people 18208 17:24:25,552 --> 17:24:27,562 and no mention of shows, per se. 18209 17:24:27,561 --> 17:24:31,072 It only has a column called\nshow ID, which is an integer 18210 17:24:31,072 --> 17:24:33,561 and a person ID, which is an integer. 18211 17:24:33,561 --> 17:24:39,661 Meanwhile, if we scrolled\ndown to the bottom 18212 17:24:39,661 --> 17:24:42,831 you will see a table called people. 18213 17:24:42,832 --> 17:24:48,351 And we have decided in IMDb's world\n 18214 17:24:48,351 --> 17:24:52,851 will have a unique identifier that's\n 18215 17:24:52,851 --> 17:24:56,841 date, which is numeric, and\nthen, again, specifying that ID 18216 17:24:56,841 --> 17:25:00,691 is going to be their primary key. 18217 17:25:02,281 --> 17:25:07,981 Well, it turns out that TV stars and\n 18218 17:25:07,982 --> 17:25:13,072 So using this relational database,\n 18219 17:25:13,072 --> 17:25:15,112 We're factoring out commonalities. 18220 17:25:15,112 --> 17:25:17,912 And if a person can be\ndifferent things in life 18221 17:25:17,911 --> 17:25:20,601 well, we're defining them\nfirst and foremost as people. 18222 17:25:20,601 --> 17:25:23,491 And then, notice these two\ntables are almost the same. 
18223 17:25:23,491 --> 17:25:26,002 The stars table has a show\nID, which is a number 18224 17:25:26,002 --> 17:25:28,012 and a person ID, which\nis a number, which 18225 17:25:28,012 --> 17:25:36,052 allows us via this middleman table, if\n 18226 17:25:36,052 --> 17:25:41,422 Similarly, the writers table allows\n 18227 17:25:41,421 --> 17:25:43,561 by just recording those numbers. 18228 17:25:43,561 --> 17:25:46,322 So if we go into this data\nset, let's do the following. 18229 17:25:46,322 --> 17:25:49,682 Let's do select star\nfrom people semicolon. 18230 17:25:49,682 --> 17:25:52,372 So a huge amount of data is coming back. 18231 17:25:52,372 --> 17:25:56,822 This is hundreds of thousands of rows\n 18232 17:25:56,822 --> 17:25:59,661 So this is real-world data\nnow flying across the screen. 18233 17:25:59,661 --> 17:26:03,501 There's a lot of people in the TV show\n 18234 17:26:06,021 --> 17:26:07,318 There's a lot of data there. 18235 17:26:07,319 --> 17:26:09,652 So my god, if you had to do\nanything manual in this data 18236 17:26:09,652 --> 17:26:12,002 set it's probably not going\nto work out very well. 18237 17:26:12,002 --> 17:26:14,932 And actually, we're up to, what,\na million people in this data 18238 17:26:14,932 --> 17:26:17,332 set, plus, which would mean\nthis probably isn't even 18239 17:26:17,332 --> 17:26:20,932 going to open very well in Excel, or\n 18240 17:26:20,932 --> 17:26:23,042 SQL probably is the\nbetter approach here. 18241 17:26:23,042 --> 17:26:25,702 Let's search for someone\nspecific, like select star 18242 17:26:25,701 --> 17:26:31,401 from people, where name equals\n 18243 17:26:32,044 --> 17:26:33,502 All right, so there's Steve Carell. 18244 17:26:33,502 --> 17:26:39,442 He is person number\n136,797, born in 1962. 18245 17:26:39,442 --> 17:26:41,882 And that's as much data as\nwe have on Steve Carell here. 18246 17:26:41,881 --> 17:26:44,551 How do we figure out what\nshows, for instance, he's in? 
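The lookup just described can be sketched as follows with Python's sqlite3 module. The column names (id, name, birth) are assumptions based on the spoken description of the people table; the single row uses the ID and birth year read out in the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed layout: one row per person, plus a table of (show, person) pairs.
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL, birth NUMERIC);
    CREATE TABLE stars (show_id INTEGER NOT NULL, person_id INTEGER NOT NULL);
""")
conn.execute("INSERT INTO people (id, name, birth) VALUES (136797, 'Steve Carell', 1962)")

# Search for one specific person by name.
rows = conn.execute("SELECT * FROM people WHERE name = 'Steve Carell'").fetchall()
print(rows)  # [(136797, 'Steve Carell', 1962)]
```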
18247 17:26:44,552 --> 17:26:48,842 Well, let's see, select\nstar from shows, semicolon. 18248 17:26:48,841 --> 17:26:52,521 There's a crazy number of shows\nout there in the IMDb database. 18249 17:26:52,521 --> 17:26:55,491 And you can see it here again\nflying across the screen. 18250 17:26:55,491 --> 17:26:58,972 Feels like we're going to have to\n 18251 17:26:58,972 --> 17:27:02,402 to get at all of Steve Carell's shows. 18252 17:27:02,402 --> 17:27:04,562 So how are we going to do that? 18253 17:27:04,561 --> 17:27:07,072 Well, god, this is a lot of data here. 18254 17:27:07,072 --> 17:27:10,461 And in fact, yeah, we\nhave, what, 15 million 18255 17:27:10,461 --> 17:27:12,622 shows plus in this data set, too. 18256 17:27:12,622 --> 17:27:15,682 So doing things efficiently is\nnow going to start to matter. 18257 17:27:17,131 --> 17:27:18,801 Let me select a specific show. 18258 17:27:18,802 --> 17:27:23,932 Select star from shows where title\n 18259 17:27:23,932 --> 17:27:26,262 And there presumably shouldn't\nbe typos in this data 18260 17:27:26,262 --> 17:27:28,812 because it comes from the\nreal website IMDb.com. 18261 17:27:30,341 --> 17:27:33,551 Turns out there's been a lot of\nThe Offices out in the world. 18262 17:27:33,552 --> 17:27:37,512 The one that started in 2005\nis the one that we want 18263 17:27:37,512 --> 17:27:40,332 presumably the most\npopular with 188 episodes. 18264 17:27:41,561 --> 17:27:46,991 Maybe we could do and year\nequals, how about 2005? 18265 17:27:46,991 --> 17:27:50,591 All right, so now we've got\nback just the ID of The Office 18266 17:27:52,572 --> 17:27:55,451 Let me turn on a timer\nwithin SQLite just 18267 17:27:55,451 --> 17:27:57,131 to get a sense of running time now. 18268 17:27:58,451 --> 17:28:01,391 Select star from shows, where\ntitle equals The Office 18269 17:28:04,192 --> 17:28:05,711 Let's just do titles for now. 18270 17:28:06,792 --> 17:28:08,802 All right, so not terribly long. 
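A small sketch of narrowing the search from title alone to title plus year. The shows layout is assumed, and every row except the 2005 entry with 188 episodes is a made-up placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT, year NUMERIC, episodes INTEGER)"
)
conn.executemany(
    "INSERT INTO shows (id, title, year, episodes) VALUES (?, ?, ?, ?)",
    [
        (1, "The Office", 1999, 10),       # placeholder row with the same title
        (2, "The Office", 2005, 188),      # the one the lecture is after
        (3, "Some Other Show", 2005, 20),  # placeholder row with the same year
    ],
)

# Title alone matches more than one show.
by_title = conn.execute("SELECT * FROM shows WHERE title = 'The Office'").fetchall()

# Adding the year narrows it down to a single row.
by_title_and_year = conn.execute(
    "SELECT * FROM shows WHERE title = 'The Office' AND year = 2005"
).fetchall()

print(len(by_title))      # 2
print(by_title_and_year)  # [(2, 'The Office', 2005, 188)]
```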
18271 17:28:08,802 --> 17:28:12,332 It found it pretty fast, but it looks\n 18272 17:28:12,332 --> 17:28:15,351 0.02 seconds, not bad for just a title. 18273 17:28:15,351 --> 17:28:18,551 But just to plant a seed, it\nturns out that we can probably 18274 17:28:20,271 --> 17:28:23,561 Let me create something called an\n 18275 17:28:23,561 --> 17:28:25,511 in CRUD for creating something. 18276 17:28:25,512 --> 17:28:28,152 And I'm going to call this title index. 18277 17:28:28,152 --> 17:28:32,682 And I'm going to create\nit on the shows table 18278 17:28:32,682 --> 17:28:34,614 specifically on the title column. 18279 17:28:34,614 --> 17:28:37,031 And we'll see in a moment what\nthis is going to do for me. 18280 17:28:38,262 --> 17:28:42,472 Took a moment, like 0.349 seconds,\n 18281 17:28:42,472 --> 17:28:46,932 But now watch, if I select star from\n 18282 17:28:46,932 --> 17:28:49,572 previously it took me 0.021 seconds. 18283 17:28:53,021 --> 17:28:56,531 Literally no time at all, or so low\n 18284 17:28:56,531 --> 17:28:58,811 And I'll do it again just\nto get a sense of things. 18285 17:29:00,201 --> 17:29:05,362 Now even though 0.021 seconds, not crazy\n 18286 17:29:05,362 --> 17:29:07,902 a lot of users running a real\nwebsite or real mobile app. 18287 17:29:07,902 --> 17:29:11,322 Every millisecond we can start to\n 18288 17:29:13,152 --> 17:29:17,171 Well, we actually just created\nsomething called an index. 18289 17:29:17,171 --> 17:29:19,301 And this is a nice way\nto tie in, now, some 18290 17:29:19,302 --> 17:29:21,762 of our week 5 discussion\nof data structures 18291 17:29:21,762 --> 17:29:23,592 and our week 3 discussion\nof running times. 
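The index creation described above can be sketched like this. The name title_index (with an underscore) is an assumption for the spoken "title index", and the table holds made-up titles. Instead of timing the query, the sketch asks SQLite for its query plan before and after; the exact wording of the plan varies between SQLite versions, so the code only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO shows (title) VALUES (?)",
    [(f"Show {i}",) for i in range(1000)],  # made-up titles
)

query = "SELECT * FROM shows WHERE title = 'Show 500'"

def plan(sql):
    """Return SQLite's description of how it would run the query."""
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable detail string.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)  # no index yet, so the table is scanned
conn.execute("CREATE INDEX title_index ON shows (title)")
plan_after = plan(query)   # now the plan refers to title_index

print("title_index" in plan_before)  # False
print("title_index" in plan_after)   # True
```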
18292 17:29:23,591 --> 17:29:26,621 An index in a database is\nsome kind of fancy data 18293 17:29:26,622 --> 17:29:31,452 structure that allows the database\n 18294 17:29:31,451 --> 17:29:35,921 Literally, as you just saw, these\n 18295 17:29:37,491 --> 17:29:39,641 And so when I first\nsearched for The Office 18296 17:29:39,641 --> 17:29:43,122 it was literally doing linear search,\n 18297 17:29:46,211 --> 17:29:48,671 It's not that slow, 0.021 seconds. 18298 17:29:48,671 --> 17:29:52,362 But that's relatively slow just\ntheoretically, algorithmically 18299 17:29:53,921 --> 17:29:57,432 But if you instead create\nan index using syntax 18300 17:29:57,432 --> 17:30:03,222 like this, which I just did, creating an\n 18301 17:30:03,222 --> 17:30:06,561 table, that's like giving the\ndatabase a clue in advance saying 18302 17:30:06,561 --> 17:30:10,002 hey, I know I'm going to search on\n 18303 17:30:10,002 --> 17:30:12,771 Do something with data\nstructures to speed things up. 18304 17:30:12,771 --> 17:30:15,371 And so if you think back to our\ndiscussion of data structures 18305 17:30:17,061 --> 17:30:21,401 Maybe it's using a trie or a hash\n 18306 17:30:21,402 --> 17:30:25,272 structure is generally going to lift\n 18307 17:30:26,152 --> 17:30:28,961 So it's just much faster\nto find data, especially 18308 17:30:28,961 --> 17:30:31,841 if it's sorting it now\nbased on title, and not 18309 17:30:31,841 --> 17:30:33,371 just storing it in one long list. 18310 17:30:33,372 --> 17:30:35,562 And in fact, in the world\nof relational databases 18311 17:30:35,561 --> 17:30:37,901 the type of structure that's\noften used in a database 18312 17:30:37,902 --> 17:30:39,262 is something called a B-tree. 
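To get a feel for why keeping titles sorted beats one long unsorted list, here is a rough analogy in plain Python. It is not how SQLite implements its B-trees; it only contrasts a top-to-bottom scan with binary search over made-up, sorted titles.

```python
from bisect import bisect_left

# 100,000 made-up titles, kept in sorted order.
titles = sorted(f"Show {i:06d}" for i in range(100_000))
target = titles[-1]  # worst case for a front-to-back scan

def linear_steps(items, wanted):
    """Count comparisons made by a top-to-bottom scan."""
    steps = 0
    for item in items:
        steps += 1
        if item == wanted:
            break
    return steps

def binary_steps(items, wanted):
    """Count halvings made by binary search; return (steps, position)."""
    lo, hi, steps = 0, len(items), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] < wanted:
            lo = mid + 1
        else:
            hi = mid
    return steps, lo

scan_count = linear_steps(titles, target)
halvings, position = binary_steps(titles, target)

print(scan_count)                               # 100000
print(halvings <= 17)                           # True, roughly log2(100000)
print(position == bisect_left(titles, target))  # True, agrees with the standard library
```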
18313 17:30:40,512 --> 17:30:44,382 Different use of the letter B, but it\n 18314 17:30:45,161 --> 17:30:47,021 It's not binary because\nsome of the nodes 18315 17:30:47,021 --> 17:30:50,231 might have more than\ntwo children or fewer 18316 17:30:50,232 --> 17:30:53,532 but it's a very wide but\nrelatively shallow tree. 18317 17:30:55,332 --> 17:30:59,112 And the upside of that is that if\n 18318 17:30:59,112 --> 17:31:01,402 the database can find it more quickly. 18319 17:31:01,402 --> 17:31:06,612 And the reason it took half a second,\n 18320 17:31:06,612 --> 17:31:10,601 is because SQLite needed to take\nsome non-zero amount of time 18321 17:31:10,601 --> 17:31:12,972 to just build up this tree in memory. 18322 17:31:12,972 --> 17:31:17,241 And it has algorithms for doing so based\n 18323 17:31:17,241 --> 17:31:20,682 But you spend a bit of time\nup front, a third of a second. 18324 17:31:22,811 --> 17:31:25,811 Every subsequent query, if I\nkeep doing it again and again 18325 17:31:25,811 --> 17:31:29,381 is going to be crazy\nlow, 0.000, maybe 0.001. 18326 17:31:29,381 --> 17:31:33,581 But an order of magnitude, a\nfactor of 10 or 100 faster than it 18327 17:31:36,131 --> 17:31:39,701 So we have these indexes which\nallow us to get at data faster. 18328 17:31:39,701 --> 17:31:42,671 But what if we want to\nactually get data that's 18329 17:31:42,671 --> 17:31:44,711 now across these multiple tables? 18330 17:31:45,582 --> 17:31:48,402 And how might these indices\nor indexes help further? 18331 17:31:48,402 --> 17:31:52,241 Well, it turns out there is\na way that we've seen already 18332 17:31:52,241 --> 17:31:54,851 indirectly to join two tables together. 18333 17:31:54,851 --> 17:31:58,752 Previously, when I selected\nthe ID of The Office 18334 17:31:58,752 --> 17:32:03,082 and then I searched for it in the other\n 18335 17:32:03,082 --> 17:32:05,752 I was joining two tables together. 
18336 17:32:05,752 --> 17:32:08,241 And it turns out there's a\ncouple of ways to do this. 18337 17:32:08,241 --> 17:32:11,891 Let's go ahead now and, for instance,\n 18338 17:32:11,891 --> 17:32:14,021 Not just The Office\nbut all of them, too. 18339 17:32:14,021 --> 17:32:21,656 Unfortunately, if we look at our schema,\n 18340 17:32:21,656 --> 17:32:27,201 oh, shows over here has no\nmention of the TV stars in them. 18341 17:32:27,201 --> 17:32:30,471 And people have no mention of shows. 18342 17:32:30,472 --> 17:32:34,702 We somehow need to use this\ntable here to connect the two. 18343 17:32:34,701 --> 17:32:40,161 And this is called a join table, in the\n 18344 17:32:40,161 --> 17:32:43,131 it joins the two tables\ntogether logically. 18345 17:32:43,131 --> 17:32:47,091 And so if you're savvy enough with SQL,\n 18346 17:32:47,091 --> 17:32:51,351 earlier and like recombine\ntables by using these common IDs 18347 17:32:53,671 --> 17:32:58,072 Let me go ahead and figure out,\n 18348 17:32:58,072 --> 17:32:59,311 So how am I going to do this? 18349 17:32:59,311 --> 17:33:04,461 Well, if I select star from people,\n 18350 17:33:04,461 --> 17:33:06,182 fortunately, there's only one of them. 18351 17:33:06,182 --> 17:33:12,141 So this gives me back his name,\nhis ID, and his birth year. 18352 17:33:12,141 --> 17:33:14,302 But it's really only his\nID that I care about. 18353 17:33:15,021 --> 17:33:20,841 Because in order to get back his shows,\n 18354 17:33:20,841 --> 17:33:22,981 So I need to know his ID number. 18355 17:33:22,982 --> 17:33:24,932 So what could I do with this? 18356 17:33:24,932 --> 17:33:29,572 Well, remember the schema\nand the stars table. 18357 17:33:29,572 --> 17:33:33,171 I've just gotten, from the\npeople table, Steve Carell's ID. 18358 17:33:33,171 --> 17:33:38,871 I bet by transitivity I could\nnow use his person ID, his ID 18359 17:33:38,872 --> 17:33:41,242 to get back all of his show IDs. 
18360 17:33:41,241 --> 17:33:44,511 And then once I've got all of his show\n 18361 17:33:44,512 --> 17:33:46,672 and get back all of his shows' titles. 18362 17:33:46,671 --> 17:33:50,781 So the answer is actually English\n 18363 17:33:51,631 --> 17:33:52,981 So let me go ahead and do this. 18364 17:33:52,982 --> 17:33:57,082 Let me, again, get Steve\nCarell's ID number, but not star. 18365 17:33:58,402 --> 17:34:00,952 It's a wildcard character in SQL. 18366 17:34:00,951 --> 17:34:03,651 Let me just select the\nID of Steve Carell. 18367 17:34:03,652 --> 17:34:06,982 And that gives me back 136,797. 18368 17:34:06,982 --> 17:34:08,782 And it's only giving me back one value. 18369 17:34:08,781 --> 17:34:12,051 The thing called ID is just\nthe column heading up above. 18370 17:34:12,052 --> 17:34:16,702 Now, suppose I want to\nselect all of the show IDs 18371 17:34:16,701 --> 17:34:18,801 that Steve Carell is affiliated with. 18372 17:34:18,802 --> 17:34:25,912 Let me select Show ID from stars,\nwhere the person ID in stars 18373 17:34:25,911 --> 17:34:28,822 happens to equal Steve Carell's ID. 18374 17:34:28,822 --> 17:34:32,661 So again, I'm building up my answer in\n 18375 17:34:32,661 --> 17:34:36,771 On the right, in parentheses,\nI'm getting Steve Carell's ID. 18376 17:34:36,771 --> 17:34:40,671 On the left, I am now\nselecting all of the show IDs 18377 17:34:40,671 --> 17:34:44,752 that have some connection with\n 18378 17:34:44,752 --> 17:34:47,192 This answer, too, is not\ngoing to be that illuminating. 18379 17:34:47,192 --> 17:34:50,762 It's just a whole bunch of integers\n 18380 17:34:50,762 --> 17:34:53,112 But let's take this one step further. 18381 17:34:53,112 --> 17:34:54,862 And even though my\ncode is getting long, I 18382 17:34:54,862 --> 17:34:57,262 could hit Enter and format\nit nicely, especially 18383 17:34:57,262 --> 17:34:59,002 if I were doing this in a code file. 
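The step-by-step plan described here (person ID, then show IDs, then titles) can be sketched as nested queries. The three-table layout is assumed from the lecture's description; apart from Steve Carell's ID, all rows and titles are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE stars (show_id INTEGER, person_id INTEGER);
    INSERT INTO people VALUES (136797, 'Steve Carell'), (2, 'Someone Else');
    INSERT INTO shows VALUES (10, 'The Office'), (11, 'Show B'), (12, 'Show C');
    INSERT INTO stars VALUES (10, 136797), (11, 136797), (12, 2);
""")

# Step 1: the person's ID.
person_id = conn.execute(
    "SELECT id FROM people WHERE name = 'Steve Carell'"
).fetchone()[0]

# Step 2: the show IDs for that person, with step 1 nested in parentheses.
show_ids = [row[0] for row in conn.execute(
    "SELECT show_id FROM stars WHERE person_id = "
    "(SELECT id FROM people WHERE name = 'Steve Carell')"
)]

# Step 3: the titles for those show IDs, with step 2 nested in turn.
titles = [row[0] for row in conn.execute(
    "SELECT title FROM shows WHERE id IN "
    "(SELECT show_id FROM stars WHERE person_id = "
    "(SELECT id FROM people WHERE name = 'Steve Carell')) "
    "ORDER BY title"
)]

print(person_id)         # 136797
print(sorted(show_ids))  # [10, 11]
print(titles)            # ['Show B', 'The Office']
```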
18384 17:34:59,002 --> 17:35:00,921 But I'm just doing it\ninteractively for now. 18385 17:35:00,921 --> 17:35:04,761 Let's now select all of the\ntitles from the shows table 18386 17:35:04,762 --> 17:35:13,442 where the ID of the show is in\nthis following previous query. 18387 17:35:13,442 --> 17:35:15,082 So again, the query is getting long. 18388 17:35:15,082 --> 17:35:17,542 But notice, it's the\nthird and last step. 18389 17:35:17,542 --> 17:35:21,292 Select title from the shows\ntable, where the ID of the show 18390 17:35:21,292 --> 17:35:23,932 is in the list of all\nof the show IDs that 18391 17:35:23,932 --> 17:35:27,381 came back from the stars table\n 18392 17:35:27,381 --> 17:35:28,822 How did we get that person ID? 18393 17:35:30,021 --> 17:35:36,502 Well, I selected, in my innermost\n 18394 17:35:36,502 --> 17:35:38,781 So now, when I hit Enter, voila. 18395 17:35:38,781 --> 17:35:41,866 I get all of Steve Carell's\nTV shows up until now. 18396 17:35:41,866 --> 17:35:44,991 And if I want to tidy this up further,\n 18397 17:35:47,521 --> 17:35:50,881 Now I've got it all\nalphabetized as before. 18398 17:35:50,881 --> 17:35:53,421 So again, with SQL comes\nthe ability to search-- 18399 17:35:53,421 --> 17:35:56,631 I mean, look how quickly\nwe do this, 0.094 seconds 18400 17:35:56,631 --> 17:35:59,991 to search across three different\ntables to get back this answer. 18401 17:35:59,991 --> 17:36:04,161 But my data is now all neatly\ndesigned in individual tables 18402 17:36:04,161 --> 17:36:07,341 which is going to be important\n 18403 17:36:07,341 --> 17:36:09,681 But let me take this one step further. 18404 17:36:09,682 --> 17:36:12,271 Let me go ahead and do this. 18405 17:36:12,271 --> 17:36:16,921 Let me go ahead and point\nout that with this query 18406 17:36:16,921 --> 17:36:20,211 notice that I'm searching on-- 18407 17:36:20,211 --> 17:36:24,051 let's say I'm searching\non a person ID here. 
18408 17:36:24,052 --> 17:36:27,752 And at the end here, I'm\nsearching on a name column here. 18409 17:36:27,752 --> 17:36:30,572 So let me actually go ahead and do this. 18410 17:36:30,572 --> 17:36:34,851 Let me go ahead and see\nif we can't speed this up. 18411 17:36:34,851 --> 17:36:38,432 This query at the moment\ntakes 0.092 seconds. 18412 17:36:38,432 --> 17:36:41,271 Let's see if we can't speed this\n 18413 17:36:41,271 --> 17:36:44,271 a few more of those B-trees\nin the database's memory. 18414 17:36:44,271 --> 17:36:49,581 Create an index called person index, and\n 18415 17:36:52,192 --> 17:36:53,830 It's taking a moment, taking a moment. 18416 17:36:53,830 --> 17:36:56,122 That's almost a full second\nbecause that's a big table. 18417 17:36:56,122 --> 17:37:00,391 Let's create another index called\nshow index on the stars table. 18418 17:37:00,891 --> 17:37:03,292 Because I want to search\nby the show ID also. 18419 17:37:03,292 --> 17:37:05,122 That was part of my big query. 18420 17:37:06,002 --> 17:37:09,152 OK, just more than\nabout 2/3 of a second. 18421 17:37:09,152 --> 17:37:11,781 Now let's create one last one,\nanother index called name index 18422 17:37:11,781 --> 17:37:14,502 but I could call these things\n 18423 17:37:14,902 --> 17:37:16,891 Because I'm also searching\non the name column. 18424 17:37:16,891 --> 17:37:19,072 So in short, I'm\ncreating indexes on each 18425 17:37:19,072 --> 17:37:22,792 of the columns that are somehow\ninvolved in my search query 18426 17:37:22,792 --> 17:37:25,022 going from one table to the other. 18427 17:37:25,021 --> 17:37:32,932 Now let's go back to the previous\nquery, which, recall, took-- 18428 17:37:36,112 --> 17:37:37,987 Well, it was roughly\nthis order of magnitude. 18429 17:37:37,987 --> 17:37:39,482 We're not seeing the data now. 18430 17:37:39,482 --> 17:37:42,502 But let me go ahead and run\nmy original big query once. 18431 17:37:42,502 --> 17:37:45,722 And boom, we're down to almost nothing.
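A sketch of the three indexes just created, one per column involved in the big query. The underscore names stand in for the spoken "person index", "show index", and "name index", and placing person_index on the stars table's person ID column is inferred from context, since that part of the caption is cut off. The tables are empty here; the point is only the syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stars (show_id INTEGER, person_id INTEGER);

    -- One index per column that the nested query searches on.
    CREATE INDEX person_index ON stars (person_id);
    CREATE INDEX show_index ON stars (show_id);
    CREATE INDEX name_index ON people (name);
""")

# SQLite records every index it knows about in its sqlite_master catalog table.
index_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(index_names)  # ['name_index', 'person_index', 'show_index']
```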
18432 17:37:45,722 --> 17:37:48,202 So again, creating\nthese indexes in memory 18433 17:37:48,201 --> 17:37:52,981 has the effect of rapidly\nspeeding up our computation time. 18434 17:37:52,982 --> 17:37:56,482 Now if you've ever used, for instance,\n 18435 17:37:56,482 --> 17:38:00,472 here on campus, or Yale's analogue, you\n 18436 17:38:00,472 --> 17:38:04,671 This could be one of the reasons why\n 18437 17:38:04,671 --> 17:38:07,011 thousands of courses\ntend to be slow, if 18438 17:38:07,012 --> 17:38:10,222 and I'm only conjecturing, if the\n 18439 17:38:10,222 --> 17:38:12,232 If you're building your\nown web application 18440 17:38:12,232 --> 17:38:14,512 and you're finding that users\nare waiting and waiting 18441 17:38:14,512 --> 17:38:17,632 and things are spinning and spinning,\n 18442 17:38:17,631 --> 17:38:21,112 Well, it could absolutely just be bad\n 18443 17:38:21,112 --> 17:38:23,612 Or it might be that you\nhaven't thought about, well 18444 17:38:23,612 --> 17:38:27,112 what column should be optimized\nfor searches and filtration 18445 17:38:27,112 --> 17:38:31,046 like I've done here in order\nto speed up subsequent queries? 18446 17:38:31,046 --> 17:38:33,171 Again, from the outside\nin, we can only conjecture. 18447 17:38:33,171 --> 17:38:36,921 But ultimately, this is\njust one of the things that 18448 17:38:36,921 --> 17:38:39,650 explains performance problems as well. 18449 17:38:39,650 --> 17:38:42,442 All right, let's point out just a\n 18450 17:38:42,442 --> 17:38:45,112 and then we'll consider,\nbigger picture, some problems 18451 17:38:45,112 --> 17:38:47,451 that might arise in this world. 18452 17:38:47,451 --> 17:38:52,221 If these nested, nested queries\nstart to get a little much 18453 17:38:52,222 --> 17:38:54,502 there are other ways,\njust so you've seen it 18454 17:38:54,502 --> 17:38:57,262 that you can execute\nsimilar logic in SQL. 
18455 17:38:57,262 --> 17:38:59,752 For instance, if I\nknow in advance that I 18456 17:38:59,752 --> 17:39:04,732 want to connect Steve Carell to\n 18457 17:39:04,732 --> 17:39:06,532 we can do something more like this. 18458 17:39:06,531 --> 17:39:17,391 Select title from the people table,\n 18459 17:39:21,832 --> 17:39:25,124 And again, this is not something you'll\n 18460 17:39:25,124 --> 17:39:29,422 But just so you've seen other\n 18461 17:39:30,502 --> 17:39:35,031 This is an explicit way to say, take\n 18462 17:39:35,031 --> 17:39:37,371 table in the other hand,\nand somehow join them 18463 17:39:37,372 --> 17:39:39,332 as I keep doing with my fingertips here. 18464 17:39:40,951 --> 17:39:45,711 Join them so that the people, the ID\n 18465 17:39:45,711 --> 17:39:48,601 with the person ID in the stars table. 18466 17:39:48,601 --> 17:39:50,601 But that's not quite everything. 18467 17:39:50,601 --> 17:39:54,082 I could also say, join\nfurther on the shows table 18468 17:39:54,082 --> 17:40:00,632 where the stars show ID\nequals the shows ID column. 18469 17:40:01,881 --> 17:40:11,331 That's saying, go further and join\n 18470 17:40:11,332 --> 17:40:14,332 joining the show ID\ncolumn with the ID column. 18471 17:40:14,332 --> 17:40:17,092 Again, this starts to get a\nlittle messy to think about. 18472 17:40:17,091 --> 17:40:21,171 But now I can just say, where name\n 18473 17:40:21,171 --> 17:40:24,411 I can do in one query what previously\n 18474 17:40:24,411 --> 17:40:25,941 and get back the same answers. 18475 17:40:25,942 --> 17:40:30,722 And I can still add in my order\nby title to get back the result. 18476 17:40:30,722 --> 17:40:35,122 And if I do this a little more\n 18477 17:40:36,091 --> 17:40:41,961 Let me type this out by adding a\n 18478 17:40:41,961 --> 17:40:43,461 I'm going to leave it alone for now. 18479 17:40:43,461 --> 17:40:46,042 We can type it on multiple\nlines in other contexts. 
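The explicit JOIN form described above might look like the following sketch. The table layout is assumed from the lecture, and all rows other than Steve Carell's ID are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE stars (show_id INTEGER, person_id INTEGER);
    INSERT INTO people VALUES (136797, 'Steve Carell'), (2, 'Someone Else');
    INSERT INTO shows VALUES (10, 'The Office'), (11, 'Show B'), (12, 'Show C');
    INSERT INTO stars VALUES (10, 136797), (11, 136797), (12, 2);
""")

# Line people up with stars on the person's ID,
# then line stars up with shows on the show's ID.
joined_titles = [row[0] for row in conn.execute("""
    SELECT title
    FROM people
    JOIN stars ON people.id = stars.person_id
    JOIN shows ON stars.show_id = shows.id
    WHERE name = 'Steve Carell'
    ORDER BY title
""")]

print(joined_titles)  # ['Show B', 'The Office']
```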
18480 17:40:46,042 --> 17:40:49,042 And let me do one last thing. 18481 17:40:50,353 --> 17:40:52,311 I'm going to show it,\nbut this is not something 18482 17:40:52,311 --> 17:40:53,781 you should ingrain just yet either. 18483 17:40:53,781 --> 17:40:56,961 Select title from\npeople, stars, and shows. 18484 17:40:56,961 --> 17:41:00,201 If you know in advance that you want\n 18485 17:41:00,201 --> 17:41:03,471 you can just enumerate them,\none table name after the other. 18486 17:41:03,472 --> 17:41:08,838 And then you can say where\npeople.ID equals stars.personID. 18487 17:41:08,838 --> 17:41:10,671 And now I'm hitting\nEnter so that it formats 18488 17:41:10,671 --> 17:41:12,411 a little more readably on my screen. 18489 17:41:12,411 --> 17:41:20,481 And stars.showID equals shows.ID,\n 18490 17:41:20,482 --> 17:41:25,072 In short, you specify that you\n 18491 17:41:26,031 --> 17:41:31,792 And then you tell the database how to\n 18492 17:41:31,792 --> 17:41:35,031 that is, the columns that\nhave those integers in common. 18493 17:41:35,031 --> 17:41:38,061 If I hit Enter now, I get\nthe same exact results, ever 18494 17:41:38,061 --> 17:41:41,451 more so if I also add\nin an order by title. 18495 17:41:43,612 --> 17:41:45,531 That's why I didn't\nwant to do this earlier. 18496 17:41:45,531 --> 17:41:48,531 I have to go back through my history\n 18497 17:41:48,531 --> 17:41:49,981 the multi-line query this time. 18498 17:41:52,622 --> 17:41:56,707 But this is only to say that, even\n 18499 17:41:56,707 --> 17:41:59,332 more sophisticated, and we put\nsome of it over here, some of it 18500 17:41:59,332 --> 17:42:03,472 over here, some of it over here so as to\n 18501 17:42:03,472 --> 17:42:07,252 like putting commas in the data, we\n 18502 17:42:07,252 --> 17:42:09,622 that we might want across\nthese several tables. 
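The comma-separated form just described can be sketched next to the explicit JOIN to check that both give the same answer. As before, the layout is assumed and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE stars (show_id INTEGER, person_id INTEGER);
    INSERT INTO people VALUES (136797, 'Steve Carell'), (2, 'Someone Else');
    INSERT INTO shows VALUES (10, 'The Office'), (11, 'Show B'), (12, 'Show C');
    INSERT INTO stars VALUES (10, 136797), (11, 136797), (12, 2);
""")

# Enumerate the tables, then say in WHERE which columns must line up.
implicit = conn.execute("""
    SELECT title
    FROM people, stars, shows
    WHERE people.id = stars.person_id
      AND stars.show_id = shows.id
      AND name = 'Steve Carell'
    ORDER BY title
""").fetchall()

# The explicit JOIN spelling of the same logic, for comparison.
explicit = conn.execute("""
    SELECT title
    FROM people
    JOIN stars ON people.id = stars.person_id
    JOIN shows ON stars.show_id = shows.id
    WHERE name = 'Steve Carell'
    ORDER BY title
""").fetchall()

print(implicit)              # [('Show B',), ('The Office',)]
print(implicit == explicit)  # True
```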
18503 17:42:09,622 --> 17:42:13,922 And using indexes, we can\nsignificantly speed up these processes 18504 17:42:13,921 --> 17:42:17,481 so as to handle 10 times as\nmany, a 100 times as many users 18505 17:42:17,482 --> 17:42:19,012 on the same actual database. 18506 17:42:19,012 --> 17:42:20,362 There is going to be a downside. 18507 17:42:20,362 --> 17:42:22,881 And thinking back to our\ndiscussion of algorithms and data 18508 17:42:22,881 --> 17:42:27,461 structures in past weeks, what might be\n 18509 17:42:27,461 --> 17:42:31,451 Because as of now, I created four\n 18510 17:42:31,451 --> 17:42:34,901 the title column, and\nsome other columns, too. 18511 17:42:34,902 --> 17:42:37,272 Why wouldn't I just go\nahead and index everything 18512 17:42:37,271 --> 17:42:39,731 if it's clearly speeding things up? 18513 17:42:41,112 --> 17:42:44,232 Any time you're starting to benefit\n 18514 17:42:44,232 --> 17:42:47,292 odds are you're sacrificing\nspace, or vice versa. 18515 17:42:47,292 --> 17:42:50,741 And probably indexing absolutely\neverything is a little dumb 18516 17:42:50,741 --> 17:42:54,771 because you're going to waste way more\n 18517 17:42:54,771 --> 17:42:56,951 So figuring out where the\nright inflection point is 18518 17:42:56,951 --> 17:43:01,752 is part of the process of designing and\n 18519 17:43:01,752 --> 17:43:06,252 Now unfortunately, a whole lot of\n 18520 17:43:06,252 --> 17:43:10,211 and they continue to in the real\n 18521 17:43:10,211 --> 17:43:12,101 And in fact, here on\nout, if you're reading 18522 17:43:12,101 --> 17:43:16,871 something technical about SQL databases,\n 18523 17:43:16,872 --> 17:43:20,002 and passwords leaking out,\nunfortunately, all too often 18524 17:43:20,002 --> 17:43:22,847 it is because of what are\ncalled SQL injection attacks. 
18525 17:43:22,847 --> 17:43:24,972 And just to give you a\nsense now to counterbalance 18526 17:43:24,972 --> 17:43:26,930 maybe [INAUDIBLE] enthusiasm\nfor like, oh, that 18527 17:43:26,930 --> 17:43:28,961 was neat how we can\ndo things so quickly. 18528 17:43:28,961 --> 17:43:32,021 With great power comes\nresponsibility in this world, too. 18529 17:43:32,021 --> 17:43:34,661 And so many people introduce\nbugs into their code 18530 17:43:34,661 --> 17:43:42,501 by not quite appreciating how it is the\n 18531 17:43:43,732 --> 17:43:46,542 Here, for instance, is a\ntypical login screen for Yale. 18532 17:43:46,542 --> 17:43:49,122 And here's the analogue for\nHarvard where you're prompted 18533 17:43:49,122 --> 17:43:51,792 every day probably, for your\nusername and your password 18534 17:43:51,792 --> 17:43:53,802 your email address and\nyour password here. 18535 17:43:53,802 --> 17:43:57,762 Suppose, though, that\nbehind this login page 18536 17:43:57,762 --> 17:44:00,372 whether Harvard's or Yale's,\nthere's some website. 18537 17:44:00,372 --> 17:44:03,612 And that website is using\nSQL underneath the hood 18538 17:44:03,612 --> 17:44:06,042 to store all of the\nHarvard or Yale people's 18539 17:44:06,042 --> 17:44:09,281 usernames, passwords, ID\nnumbers, courses, transcripts 18540 17:44:10,311 --> 17:44:12,822 So there's a SQL database\nunderneath the website. 18541 17:44:12,822 --> 17:44:15,701 Well, what might go\nwrong with this process? 18542 17:44:15,701 --> 17:44:18,191 Unfortunately, there's\nsome special syntax in SQL 18543 17:44:18,192 --> 17:44:19,872 just like there is in C and Python. 18544 17:44:19,872 --> 17:44:22,302 For instance, there are\ncomments in SQL, too. 18545 17:44:22,302 --> 17:44:26,022 If you do two hyphens, dash,\ndash, that's a comment in SQL. 
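A tiny sketch of the comment syntax just mentioned, run through Python's sqlite3 module: everything from the two hyphens to the end of the line is ignored by the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The "+ 40" sits after the two hyphens, so it is never evaluated.
value = conn.execute("SELECT 1 + 1 -- + 40").fetchone()[0]
print(value)  # 2
```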
18546 17:44:26,021 --> 17:44:31,511 And if you, the programmer, aren't\n 18547 17:44:31,512 --> 17:44:34,902 such that you defend against\npotentially adversarial attacks 18548 17:44:34,902 --> 17:44:36,502 you might do something like this. 18549 17:44:36,502 --> 17:44:41,412 Suppose that I somewhat\nmaliciously or curiously log in 18550 17:44:41,411 --> 17:44:44,471 by typing my username,\nMalan@harvard.edu, and then maybe 18551 17:44:44,472 --> 17:44:46,332 a single quote and a dash, dash. 18552 17:44:47,021 --> 17:44:50,201 Because I'm trying to suss out\nif there is a vulnerability here 18553 17:44:53,252 --> 17:44:56,502 But if I were the owner of the website\n 18554 17:44:56,502 --> 17:45:00,641 I might try using potentially\ndangerous characters in my input. 18555 17:45:01,631 --> 17:45:05,682 Because single quote is used for\n 18556 17:45:05,682 --> 17:45:07,152 single quotes or double quotes. 18557 17:45:07,152 --> 17:45:10,272 Dash, dash, I claim now,\nis used for commenting. 18558 17:45:10,271 --> 17:45:13,301 But let's now imagine what\nthe code underneath the hood 18559 17:45:13,302 --> 17:45:17,502 might be for something like\nYale's login or Harvard's login. 18560 17:45:17,502 --> 17:45:19,942 What if it's code that looks like this? 18561 17:45:19,942 --> 17:45:21,882 So let me read it from left to right. 18562 17:45:21,881 --> 17:45:26,051 Suppose that they are using something\n 18563 17:45:26,052 --> 17:45:28,572 and they've got some SQL\ntyped into the website that 18564 17:45:28,572 --> 17:45:32,502 says select star from users,\nwhere username equals this 18565 17:45:34,391 --> 17:45:37,851 And they're plugging in\nusername and password. 18566 17:45:38,949 --> 17:45:41,531 Well, when the user types their\nusername password, hits Enter 18567 17:45:41,531 --> 17:45:44,262 I probably want to select\nthat user from my database 18568 17:45:44,262 --> 17:45:46,362 to see if the username\nand passwords match. 
18569 17:45:46,362 --> 17:45:49,061 So the underlying SQL\nmight be, select star 18570 17:45:49,061 --> 17:45:51,131 from users, where username\nequals question mark 18571 17:45:51,131 --> 17:45:52,548 and password equals question mark. 18572 17:45:57,341 --> 17:46:02,771 And if we get back one row,\npresumably Malan@harvard.edu 18573 17:46:04,311 --> 17:46:06,531 We should let him proceed\nfrom there on out. 18574 17:46:06,531 --> 17:46:10,481 So that's some pseudo code, if\nyou will, for this scenario. 18575 17:46:10,482 --> 17:46:14,922 What if, though, this code is not\n 18576 17:46:14,921 --> 17:46:16,841 is, and isn't using question marks? 18577 17:46:16,841 --> 17:46:20,098 So the question mark syntax\nis a fairly common SQL thing 18578 17:46:20,099 --> 17:46:22,182 where the question marks\nare used as placeholders 18579 17:46:22,182 --> 17:46:24,732 just like in printf, percent S was. 18580 17:46:24,732 --> 17:46:28,242 But this function, db.execute\nfrom CS50's library 18581 17:46:28,241 --> 17:46:30,761 and third-party libraries\nas well, is also 18582 17:46:30,762 --> 17:46:33,132 doing some good stuff\nwith these question marks 18583 17:46:33,131 --> 17:46:35,171 and defending against\nthe following attack. 18584 17:46:35,171 --> 17:46:38,261 Suppose that you were not using\na third-party library like ours 18585 17:46:38,262 --> 17:46:41,832 and you were just manually constructing\n 18586 17:46:41,832 --> 17:46:45,281 You were to do something like this\n 18587 17:46:45,281 --> 17:46:47,141 You're comfortable with\nformat strings now. 18588 17:46:47,141 --> 17:46:50,224 You've gotten into the habit of using\n 18589 17:46:50,224 --> 17:46:52,572 Suppose that you, the\naspiring programmer 18590 17:46:52,572 --> 17:46:55,002 is just using techniques\nthat you've been taught. 
18591 17:46:55,002 --> 17:46:58,002 So you have an f-string\nwith select star from users 18592 17:46:58,002 --> 17:47:01,811 where username equals, quote,\n 18593 17:47:01,811 --> 17:47:06,612 And password equals, quote,\nunquote, "password" in curly braces. 18594 17:47:06,612 --> 17:47:09,612 As of what, two weeks\nago, this was perfectly 18595 17:47:09,612 --> 17:47:14,802 legitimate technique in Python\nto plug in values into a string. 18596 17:47:14,802 --> 17:47:18,972 But notice if you are using\nsingle quotes yourself 18597 17:47:18,972 --> 17:47:24,092 and the user has typed in single\nquotes to their input, what 18598 17:47:25,262 --> 17:47:29,851 Where are we going with this if you're\n 18599 17:47:29,851 --> 17:47:33,691 into your own prepared string of text? 18600 17:47:40,942 --> 17:47:46,012 Worst case, they could insert what is\n 18601 17:47:47,152 --> 17:47:50,482 Generally speaking, if you're using\n 18602 17:47:50,482 --> 17:47:52,342 to surround the user's\ninput, you'd better 18603 17:47:52,341 --> 17:47:54,591 hope that they don't have\nan apostrophe in their name. 18604 17:47:54,591 --> 17:47:57,216 Or you better hope that they\ndon't type a single quote as well. 18605 17:47:57,216 --> 17:48:01,366 Because what if their single quote\n 18606 17:48:01,366 --> 17:48:03,241 and then the rest of\nthis is somehow ignored? 18607 17:48:03,241 --> 17:48:04,671 Well, let's consider\nhow this might happen. 18608 17:48:05,851 --> 17:48:08,182 This got a little\nblurry here, but let me 18609 17:48:08,182 --> 17:48:10,192 plug in here-- wow, that looks awful. 18610 17:48:13,281 --> 17:48:15,771 Just change this to white\nso it's more readable. 18611 17:48:15,771 --> 17:48:22,072 What happens if the\nuser does this instead? 18612 17:48:22,072 --> 17:48:24,652 They type in, like I\ndid into the screenshot 18613 17:48:24,652 --> 17:48:28,101 'Malan@harvard.edu,'\nsingle quote, dash, dash. 
18614 17:48:28,101 --> 17:48:30,411 What has just happened\nlogically, even though we've 18615 17:48:30,411 --> 17:48:32,241 only just begun with SQL today? 18616 17:48:32,241 --> 17:48:37,491 Well, select star from users, where\n 18617 17:48:38,661 --> 17:48:42,681 What's bad about the rest of this? 18618 17:48:42,682 --> 17:48:45,093 Dash, dash, I claim,\nmeans a comment, which 18619 17:48:45,093 --> 17:48:47,551 means my color coding is going\nto be a little blurry again. 18620 17:48:47,552 --> 17:48:50,482 But everything after the\ndash, dash is just ignored. 18621 17:48:50,482 --> 17:48:52,782 The logic, then, of\nthe SQL query, then, is 18622 17:48:52,781 --> 17:48:56,101 to just say, select\nMalan@harvard.edu from the database 18623 17:48:56,101 --> 17:48:58,832 not even checking the password anymore. 18624 17:48:58,832 --> 17:49:01,522 Therefore, you will get\nback at least one row. 18625 17:49:01,521 --> 17:49:06,531 So length of rows will equal 1, and so\n 18626 17:49:06,531 --> 17:49:09,531 logs the user in, gives them\naccess to my my.harvard account 18627 17:49:10,491 --> 17:49:15,981 And they've pretended to be me simply\n 18628 17:49:15,982 --> 17:49:17,787 dash in the username field. 18629 17:49:17,786 --> 17:49:19,911 Again, please don't go\nstart doing this later today 18630 17:49:19,911 --> 17:49:21,481 on Harvard, Yale, or other websites. 18631 17:49:21,482 --> 17:49:23,012 But it could be as simple as that. 
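The attack just described can be reproduced in a few lines. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for CS50's db.execute; the users table, the sample password, and the function names login_unsafe and login_safe are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical in-memory table standing in for the users table in the lecture.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('malan@harvard.edu', 'secret')")


def login_unsafe(username, password):
    # Vulnerable: user input is pasted straight into the SQL text with an f-string.
    query = (
        f"SELECT * FROM users "
        f"WHERE username = '{username}' AND password = '{password}'"
    )
    return len(db.execute(query).fetchall()) == 1


def login_safe(username, password):
    # Question-mark placeholders: values travel separately from the SQL text.
    query = "SELECT * FROM users WHERE username = ? AND password = ?"
    return len(db.execute(query, (username, password)).fetchall()) == 1


# The input from the lecture: a closing single quote followed by dash, dash.
attack = "malan@harvard.edu'--"
print(login_unsafe(attack, "anything"))  # True: the password check was commented out
print(login_safe(attack, "anything"))    # False: the quote is treated as plain data
```

In the unsafe version the query text ends up as username = 'malan@harvard.edu'-- followed by the rest, so everything after the dash, dash is ignored. In the safe version the same input is compared as an ordinary string and matches no row.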
18632 17:49:23,512 --> 17:49:25,372 Because the programmer\npracticed what they 18633 17:49:25,372 --> 17:49:29,452 were taught, which was just to\nuse curly braces to plug in 18634 17:49:30,902 --> 17:49:33,932 But if you don't understand how the\n 18635 17:49:33,932 --> 17:49:37,597 and if you don't distrust your users\n 18636 17:49:37,597 --> 17:49:39,472 out there there's going\nto be, unfortunately 18637 17:49:39,472 --> 17:49:44,722 some adversary who just wants to try\n 18638 17:49:45,832 --> 17:49:48,322 This is what's known as\na SQL injection attack 18639 17:49:48,322 --> 17:49:52,341 because the user can type something\n 18640 17:49:52,341 --> 17:49:56,451 and trick your database into doing\n 18641 17:49:56,451 --> 17:50:00,171 like, for instance, logging the user in. 18642 17:50:00,171 --> 17:50:02,222 Worst case, they could\neven do something else. 18643 17:50:02,222 --> 17:50:06,832 Maybe the user types a semicolon, then\n 18644 17:50:06,832 --> 17:50:10,522 You could imagine doing semicolon\nupdate table grades, where 18645 17:50:10,521 --> 17:50:14,481 name equals Malan, and set the\ngrade equal to A instead of B 18646 17:50:16,012 --> 17:50:18,891 The ability to inject\nSQL into the database 18647 17:50:18,891 --> 17:50:22,161 means you can do anything you want with\n 18648 17:50:25,341 --> 17:50:28,221 And now, just a quick, little\n 18649 17:50:34,752 --> 17:50:38,362 OK, to, like, one of us, two of us. 18650 17:50:39,741 --> 17:50:41,902 All right, so let's move\non to one last condition. 18651 17:50:41,902 --> 17:50:44,472 There's one other problem\nthat can go awry here. 18652 17:50:44,472 --> 17:50:45,722 Oh, and I should explain this. 
18653 17:50:45,722 --> 17:50:50,842 So this is an allusion to the son,\n 18654 17:50:50,841 --> 17:50:54,151 The word drop, table, students, and\n 18655 17:50:54,152 --> 17:50:56,781 This is humor that only\nCS people would understand 18656 17:50:56,781 --> 17:51:00,381 because it's the mom realizing,\n 18657 17:51:01,650 --> 17:51:04,942 Less funny when you explain it, but once\n 18658 17:51:06,802 --> 17:51:10,192 So one final threat, now\nthat you are graduating 18659 17:51:10,192 --> 17:51:14,662 to the world of proper databases\nand away from CSV files alone. 18660 17:51:14,661 --> 17:51:17,511 Things can go wrong\nwhen using databases 18661 17:51:17,512 --> 17:51:21,180 and honestly, even using CSV\nfiles if you have multiple users. 18662 17:51:21,180 --> 17:51:22,972 And thus far, you and\nI have had the luxury 18663 17:51:22,972 --> 17:51:25,889 in almost every program we've written\n 18664 17:51:25,889 --> 17:51:27,172 It's just you using your code. 18665 17:51:27,171 --> 17:51:30,112 And even if your teaching fellow\nor TA is using it, probably 18666 17:51:31,402 --> 17:51:36,112 But the world gets interesting if you\n 18667 17:51:36,112 --> 17:51:40,072 on websites, such that now you might\n 18668 17:51:40,072 --> 17:51:42,771 to log in at the same time,\nliterally clicking a button 18669 17:51:42,771 --> 17:51:44,752 at the same, or nearly the same time. 18670 17:51:44,752 --> 17:51:47,722 What happens, then, if\na computer is trying 18671 17:51:47,722 --> 17:51:50,632 to handle requests from two\ndifferent people at once 18672 17:51:50,631 --> 17:51:52,822 as might happen all\nthe time on a website? 18673 17:51:52,822 --> 17:51:54,951 You might get what are\ncalled race conditions. 18674 17:51:54,951 --> 17:51:58,401 And this is a problem in computing in\n 18675 17:51:58,402 --> 17:52:02,302 with Python, really just any\ntime you have shared data 18676 17:52:02,302 --> 17:52:04,492 like a database, as follows. 
18677 17:52:04,491 --> 17:52:08,961 This apparently is one of the\nmost liked Instagram posts ever. 18678 17:52:08,961 --> 17:52:11,451 It is literally just\na picture of an egg. 18679 17:52:11,451 --> 17:52:12,987 Has anyone clicked on this egg? 18680 17:52:15,561 --> 17:52:19,222 So go search for this photo if you'd\n 18681 17:52:19,222 --> 17:52:21,452 The account is world_record_egg. 18682 17:52:21,451 --> 17:52:24,381 This is just a screenshot of\n 18683 17:52:24,381 --> 17:52:25,881 If you're in the habit\nof using Instagram 18684 17:52:25,881 --> 17:52:28,839 or like any social media site, there's\n 18685 17:52:28,839 --> 17:52:30,261 or a heart button these days. 18686 17:52:30,262 --> 17:52:32,242 And that's actually a\nreally hard problem. 18687 17:52:32,241 --> 17:52:35,391 Such a simple idea to count\nthe number of likes something 18688 17:52:35,391 --> 17:52:38,332 has, but that means\nsomeone has to click on it. 18689 17:52:38,332 --> 17:52:40,252 Your code has to detect the click. 18690 17:52:40,252 --> 17:52:43,201 Your code has to update the database,\n 18691 17:52:43,201 --> 17:52:48,081 even if multiple people are perhaps\n 18692 17:52:48,082 --> 17:52:53,692 And unfortunately, bad things can\n 18693 17:52:53,692 --> 17:52:55,882 at the same time on a computer. 18694 17:52:57,031 --> 17:53:01,012 So here's some more code, half\n 18695 17:53:01,921 --> 17:53:05,572 Suppose that what happens when you,\n 18696 17:53:05,572 --> 17:53:08,811 on the like button on\nthe Instagram post. 18697 17:53:08,811 --> 17:53:13,101 Suppose that code, like the following,\n 18698 17:53:13,101 --> 17:53:19,531 db.execute of select likes from\n 18699 17:53:22,432 --> 17:53:24,622 I'm assuming that that\nphotograph has a unique ID. 18700 17:53:24,622 --> 17:53:28,012 It's some big integer, whatever\nit was, randomly assigned. 
18701 17:53:28,012 --> 17:53:30,472 I'm assuming that when\nyou click on the heart 18702 17:53:30,472 --> 17:53:33,502 the unique ID is somehow\nsent to Instagram servers 18703 17:53:33,502 --> 17:53:36,082 so that their code can call it ID. 18704 17:53:36,082 --> 17:53:39,171 And I'm assuming that Instagram\nis using its SQL database 18705 17:53:39,171 --> 17:53:43,131 and selecting, from a posts\ntable, the current number of likes 18706 17:53:43,131 --> 17:53:46,502 of that egg for that given ID number. 18707 17:53:47,002 --> 17:53:50,294 Because I need to know how many likes it\n 18708 17:53:50,294 --> 17:53:51,531 and then update the database. 18709 17:53:51,531 --> 17:53:55,051 I need to select the data, then\nI need to update the data here. 18710 17:53:55,552 --> 17:53:59,122 So in some Python code here,\nlet's store, in a variable called 18711 17:53:59,122 --> 17:54:03,292 likes, whatever comes back in the\n 18712 17:54:03,292 --> 17:54:06,002 Again, this is new syntax\nspecific to our library 18713 17:54:06,002 --> 17:54:09,171 but a common way of getting back\nfirst row and the column called 18714 17:54:10,141 --> 17:54:12,262 So at this point in the\nstory, likes is storing 18715 17:54:12,262 --> 17:54:14,804 the total number of likes, in\nthe millions or whatever it is 18716 17:54:17,241 --> 17:54:21,741 Execute update posts,\nset the number of likes 18717 17:54:21,741 --> 17:54:25,949 equal to this value, where the\nID of the post equals this value. 18718 17:54:25,949 --> 17:54:27,531 What do I want to update the likes to? 18719 17:54:27,531 --> 17:54:31,881 Whatever likes currently is plus\n1, and then plugging in the ID. 18720 17:54:33,682 --> 17:54:37,072 I'm checking the value of\nthe likes, and maybe it's 10. 18721 17:54:37,072 --> 17:54:40,372 I'm changing 10 to 11 and\nthen updating the table. 
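The select-then-update logic above can be written out roughly as follows. This is a sketch, assuming Python's built-in sqlite3 module in place of CS50's db.execute; the posts table with id 1 and a starting count of 10, and the function name like, are assumptions made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # lets us read columns by name, e.g. row["likes"]
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts (id, likes) VALUES (1, 10)")


def like(post_id):
    # Step 1: select the current number of likes for this post.
    rows = db.execute("SELECT likes FROM posts WHERE id = ?", (post_id,)).fetchall()
    likes = rows[0]["likes"]
    # Step 2: update the row to whatever likes currently is, plus 1.
    db.execute("UPDATE posts SET likes = ? WHERE id = ?", (likes + 1, post_id))
    return likes + 1


print(like(1))  # 11
```

With a single user this behaves as expected. The two steps are separate statements, though, so nothing stops another request from running between them.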
18722 17:54:40,372 --> 17:54:43,732 But a problem can arise\nif two people have 18723 17:54:43,732 --> 17:54:48,352 clicked on that egg at roughly the\n 18724 17:54:49,682 --> 17:54:52,192 Well, in the world of\ndatabases and servers 18725 17:54:52,192 --> 17:54:56,461 and the Instagrams of the world have\n 18726 17:54:56,461 --> 17:55:00,771 So they can support millions,\nbillions even, of users nowadays. 18727 17:55:02,252 --> 17:55:05,872 Well, typically code like this\nis not what we'll call atomic. 18728 17:55:05,872 --> 17:55:09,772 To be atomic means that it all\nexecutes together or not at all. 18729 17:55:09,771 --> 17:55:14,841 Rather, code typically is executed,\n 18730 17:55:14,841 --> 17:55:18,451 And if your code is running on a server\n 18731 17:55:18,451 --> 17:55:20,871 which is absolutely the case\nfor an app like Instagram 18732 17:55:20,872 --> 17:55:23,902 if you and I click on the\nheart at roughly the same time 18733 17:55:23,902 --> 17:55:27,502 for efficiency, the computer,\nthe server, owned by Instagram 18734 17:55:27,502 --> 17:55:29,841 might execute this line of code for me. 18735 17:55:29,841 --> 17:55:31,878 Then it might execute\nthis line of code for you. 18736 17:55:31,879 --> 17:55:34,461 Then this line of code for me,\nthen this line of code for you 18737 17:55:34,461 --> 17:55:37,044 then this line of code for me,\nthen this line of code for you. 18738 17:55:37,044 --> 17:55:41,891 That is to say, our queries might\n 18739 17:55:41,891 --> 17:55:44,711 Because it'd be a little obnoxious\n 18740 17:55:44,711 --> 17:55:47,381 I'm blocked out while you're\ninteracting with the site. 18741 17:55:47,381 --> 17:55:50,119 It'd be a lot nicer for efficiency\nand fairness if somehow they 18742 17:55:50,120 --> 17:55:52,662 do a little bit of work for me,\na little bit of work for you 18743 17:55:52,661 --> 17:55:55,881 and back and forth, and back and\nforth, equitably on the server. 
18744 17:55:55,881 --> 17:55:58,661 So that's what typically happens\nby default. These lines of code 18745 17:56:00,762 --> 17:56:05,112 And they can happen in alternating\norder with other users. 18746 17:56:05,112 --> 17:56:07,150 You can get them combined like this. 18747 17:56:07,150 --> 17:56:11,182 Same order top to bottom, but other\n 18748 17:56:11,182 --> 17:56:15,342 So suppose that the number of\n 18749 17:56:15,341 --> 17:56:19,421 And suppose that Carter and I both click\n 18750 17:56:19,421 --> 17:56:21,761 And suppose this line of\ncode gets executed for me 18751 17:56:21,762 --> 17:56:25,211 and that gives me a value\nin likes, ultimately, of 10. 18752 17:56:25,211 --> 17:56:28,631 Suppose, then, that the computer takes\n 18753 17:56:28,631 --> 17:56:31,031 does the same code for\nCarter, and gets back 18754 17:56:31,031 --> 17:56:33,252 what value for the\ncurrent number of likes? 18755 17:56:34,512 --> 17:56:36,550 Because mine has not been recorded yet. 18756 17:56:36,550 --> 17:56:39,252 At this point in the story,\nsomewhere in the computer's memory 18757 17:56:39,252 --> 17:56:41,561 there's a likes variable\nfor me, storing 10. 18758 17:56:41,561 --> 17:56:44,741 There's a likes variable\nstoring 10 for Carter. 18759 17:56:44,741 --> 17:56:46,631 Then this line of code executes for me. 18760 17:56:46,631 --> 17:56:50,560 It updates the database to be likes\n 18761 17:56:50,561 --> 17:56:55,781 Then Carter's code is executed,\n 18762 17:56:59,502 --> 17:57:03,550 Because his value of likes happened\n 18763 17:57:03,550 --> 17:57:06,881 And so the metaphor here, that if we\n 18764 17:57:06,881 --> 17:57:10,131 actually act out, is something that was\n 18765 17:57:10,131 --> 17:57:15,640 systems class, whereby the most similar\n 18766 17:57:15,641 --> 17:57:17,712 if you've got a mini\nfridge in your dorm room. 
18767 17:57:17,711 --> 17:57:23,781 And one of you and your roommates comes\n 18768 17:57:23,781 --> 17:57:26,502 oh, we're out of milk, was\nhow the story went in my day. 18769 17:57:26,502 --> 17:57:30,525 So you close the refrigerator, and\n 18770 17:57:30,525 --> 17:57:31,900 and get in line to buy some milk. 18771 17:57:31,900 --> 17:57:33,650 Meanwhile, your roommate comes home. 18772 17:57:33,650 --> 17:57:38,265 They, too, inspect the state of your\n 18773 17:57:38,266 --> 17:57:40,391 open the door, and realizes,\noh, we're out of milk. 18774 17:57:41,322 --> 17:57:43,362 Close the fridge, go\nacross the street, and head 18775 17:57:43,362 --> 17:57:45,400 to maybe a different store,\nor the line is long enough 18776 17:57:45,400 --> 17:57:47,192 that you don't see each\nother at the store. 18777 17:57:47,192 --> 17:57:51,042 So long story short, you both eventually\n 18778 17:57:51,042 --> 17:57:52,900 now there's milk from\nyour other roommate 18779 17:57:52,900 --> 17:57:56,021 there because you both\nmade a decision on this 18780 17:57:56,021 --> 17:58:01,150 based on the state of a variable\n 18781 17:58:01,150 --> 17:58:03,042 And you didn't somehow communicate. 18782 17:58:03,042 --> 17:58:05,952 Now in the real world, this\nis absolutely solvable. 18783 17:58:05,951 --> 17:58:09,461 How would you fix this or avoid\nthis problem in the real world? 18784 17:58:09,461 --> 17:58:11,112 Literally, own roommate, own fridge. 18785 17:58:11,112 --> 17:58:13,092 AUDIENCE: Text your\nroommate [INAUDIBLE].. 18786 17:58:14,091 --> 17:58:15,824 Let them know, so somehow communicate. 18787 17:58:15,824 --> 17:58:18,281 And in fact, the terminology\nhere would be multiple threads 18788 17:58:18,281 --> 17:58:20,800 can somehow intercommunicate\nby having shared state 18789 17:58:20,800 --> 17:58:22,572 like the iMessage thread on your phone. 
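The same mistake as the two cartons of milk, a decision based on state that someone else is about to change, is what happened with the like counter. The following is a deterministic replay of that interleaving, not real concurrent code: a plain dictionary stands in for the database, and the variable names are hypothetical.

```python
# A deterministic replay of the interleaving, with no real threads involved.
database = {"likes": 10}

# Step 1 runs for both users before either one reaches step 2.
likes_david = database["likes"]   # David reads 10
likes_carter = database["likes"]  # Carter also reads 10

# Step 2 then runs for each user with a stale local copy.
database["likes"] = likes_david + 1   # the database now holds 11
database["likes"] = likes_carter + 1  # overwritten with 11 again, not 12

print(database["likes"])  # 11: one of the two likes was lost
```

Two people clicked, but the count only went up by one, because the second write was computed from a value read before the first write happened.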
18790 17:58:23,531 --> 17:58:26,832 You could, more dramatically,\nlock the refrigerator somehow 18791 17:58:26,832 --> 17:58:30,912 thereby making the milk\npurchasing process atomic. 18792 17:58:30,911 --> 17:58:33,730 The fundamental problem is\nthat for efficiency, again 18793 17:58:33,730 --> 17:58:37,031 computers tend to\nintermingle logic that needs 18794 17:58:37,031 --> 17:58:41,322 to happen when it's happening across\n 18795 17:58:42,521 --> 17:58:45,161 You need to make sure that all\nthree of these lines of code 18796 17:58:45,161 --> 17:58:48,610 execute for me, and then\nfor Carter, and then for you 18797 17:58:48,610 --> 17:58:51,050 if you want to ensure that\nthis count is correct. 18798 17:58:51,050 --> 17:58:54,561 And for years, when social media\n 18799 17:58:54,561 --> 17:58:56,442 this was a super hard problem. 18800 17:58:56,442 --> 17:58:59,262 Twitter used to go down all\nof the time, and tweets 18801 17:58:59,262 --> 17:59:01,601 and retweets were a thing\nthat were similarly happening 18802 17:59:02,832 --> 17:59:04,362 These are hard problems to solve. 18803 17:59:04,362 --> 17:59:06,031 And thankfully, there are solutions. 18804 17:59:06,031 --> 17:59:08,781 And we won't get into the weeds\n 18805 17:59:08,781 --> 17:59:11,442 but know that there are\nsolutions in the form of things 18806 17:59:11,442 --> 17:59:14,952 called locks, which I use that\n 18807 17:59:14,951 --> 17:59:18,911 Software locks can allow you to\n 18808 17:59:18,911 --> 17:59:20,891 look at it until you're done with it. 18809 17:59:20,891 --> 17:59:23,112 There are things called\ntransactions, which 18810 17:59:23,112 --> 17:59:26,442 allow you to do the equivalent of\n 18811 17:59:26,442 --> 17:59:29,472 out your roommate from accessing\nthat same variable, too 18812 17:59:29,472 --> 17:59:32,262 but for slightly less amount of time. 18813 17:59:32,262 --> 17:59:34,222 There are solutions to these problems. 
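One way a transaction might wrap the like-counter steps is sketched below. It assumes Python's built-in sqlite3 module rather than CS50's library, a hypothetical posts table, and SQLite's BEGIN IMMEDIATE statement, which takes the write lock before the read. With only one connection, as here, it shows the shape of the code rather than an actual race being prevented.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN and COMMIT below are the only transaction boundaries.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts (id, likes) VALUES (1, 10)")


def like(post_id):
    db.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = db.execute(
            "SELECT likes FROM posts WHERE id = ?", (post_id,)
        ).fetchone()
        db.execute(
            "UPDATE posts SET likes = ? WHERE id = ?", (row[0] + 1, post_id)
        )
        db.execute("COMMIT")  # both steps take effect together
    except Exception:
        db.execute("ROLLBACK")  # on any error, neither step takes effect
        raise


like(1)
like(1)
print(db.execute("SELECT likes FROM posts WHERE id = 1").fetchone()[0])  # 12
```

For a counter specifically, a single statement such as UPDATE posts SET likes = likes + 1 WHERE id = ? avoids the separate read altogether, which keeps the locked section as short as possible.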
18814 17:59:34,222 --> 17:59:37,991 So for instance, in Python,\nthe same code now in green 18815 17:59:37,991 --> 17:59:39,701 might look a little something like this. 18816 17:59:39,701 --> 17:59:42,491 When you know that something\nhas to happen all at once 18817 17:59:42,491 --> 17:59:46,362 altogether, you first begin a\n 18818 17:59:46,362 --> 17:59:49,048 and then you commit the\ntransaction at the very end. 18819 17:59:49,048 --> 17:59:51,131 Here, too, though, there's\ngoing to be a downside. 18820 17:59:51,131 --> 17:59:55,243 Typically, the more you use\ntransactions in this way 18821 17:59:55,243 --> 17:59:56,951 potentially the higher\nthe probability is 18822 17:59:56,951 --> 18:00:00,531 that you're going to box someone out or\n 18823 18:00:01,031 --> 18:00:02,824 Because we can't interact\nat the same time. 18824 18:00:02,824 --> 18:00:05,262 Or you might make his request\nfail if he tries to update 18825 18:00:05,262 --> 18:00:07,211 something that's already been updated. 18826 18:00:07,211 --> 18:00:10,061 So you generally want to\nhave as few lines of code 18827 18:00:10,061 --> 18:00:13,182 together in between these transactions\n 18828 18:00:13,182 --> 18:00:16,122 And you go to CVS and you get\nback really fast so as to not 18829 18:00:16,122 --> 18:00:17,891 cause these kind of performance things. 18830 18:00:17,891 --> 18:00:20,411 So things indeed\nescalated quickly today. 18831 18:00:20,411 --> 18:00:23,613 The original goal was just to solve\n 18832 18:00:23,614 --> 18:00:24,822 more effectively than Python. 18833 18:00:24,822 --> 18:00:27,114 But as soon as you have these\nmore powerful techniques 18834 18:00:27,114 --> 18:00:28,866 a whole new set of problems arises. 18835 18:00:28,866 --> 18:00:30,491 Takes practice to get comfortable with. 18836 18:00:30,491 --> 18:00:34,511 But ultimately, this is all leading\n 18837 18:00:34,512 --> 18:00:37,601 of web programming with HTML,\nCSS, and some JavaScript. 
18838 18:00:37,601 --> 18:00:40,432 The week after, bringing Python\nand SQL back into the mix. 18839 18:00:40,432 --> 18:00:42,281 So that by term's end,\nwe've really now used 18840 18:00:42,281 --> 18:00:45,101 all of these different languages\nfor what they're best at. 18841 18:00:45,101 --> 18:00:48,184 And over the next few weeks, the goal\n 18842 18:00:48,184 --> 18:00:51,273 and comfortable with what each of\n 18843 18:00:51,273 --> 18:00:52,481 Let's go ahead and wrap here. 18844 18:00:52,482 --> 18:00:53,815 I'll stick around for questions. 18845 18:02:15,171 --> 18:02:19,222 This is CS50, and this\nis already week 8. 18846 18:02:19,222 --> 18:02:21,705 And if we think back to\nthe past several weeks now 18847 18:02:21,705 --> 18:02:24,622 recall that things started pretty\n 18848 18:02:24,622 --> 18:02:27,049 in like week 0, when\nwe were using Scratch 18849 18:02:27,048 --> 18:02:29,631 because with Scratch we had a\nGUI, a graphical user interface. 18850 18:02:29,631 --> 18:02:33,171 So even as we explored variables and\n 18851 18:02:33,171 --> 18:02:36,083 you had kind of a fun environment\n 18852 18:02:36,084 --> 18:02:37,792 And then in week 1,\nwe sort of took a lot 18853 18:02:37,792 --> 18:02:41,692 of that away, when we introduced C, and\n 18854 18:02:41,692 --> 18:02:46,372 because now, all of your programs became\n 18855 18:02:46,372 --> 18:02:49,432 and gone was the mouse, the\nanimations, the menus, and so forth. 18856 18:02:49,432 --> 18:02:51,622 And so now, fast\nforward to week 8, we're 18857 18:02:51,622 --> 18:02:54,952 going to bring those kinds of\nuser interface, UI, elements back 18858 18:02:54,951 --> 18:02:56,511 in the form of web programming. 18859 18:02:56,512 --> 18:02:58,851 And this goes beyond\njust laying out websites. 
18860 18:02:58,851 --> 18:03:03,069 This will, to this week and next week,\n 18861 18:03:03,069 --> 18:03:05,362 stuff that we've been doing\nfor the past several weeks 18862 18:03:05,362 --> 18:03:07,972 using Python, using\nSQL, and now introducing 18863 18:03:07,972 --> 18:03:10,972 a couple of other languages,\non the so-called client side 18864 18:03:10,972 --> 18:03:13,461 on your own Mac, your own PC,\nyour own phone, that's going 18865 18:03:13,461 --> 18:03:15,472 to talk to those back-end services. 18866 18:03:15,472 --> 18:03:18,772 So indeed, at this end of\nCS50, does everything rather 18867 18:03:18,771 --> 18:03:22,131 come together into a user interface\nthat's just super familiar. 18868 18:03:22,131 --> 18:03:24,951 All of us are on our phones,\ndesktops, laptops, every day. 18869 18:03:24,951 --> 18:03:28,252 And increasingly, even the mobile\napps that you all are using 18870 18:03:28,252 --> 18:03:32,512 are implemented, not necessarily\n 18871 18:03:32,512 --> 18:03:34,641 if you're familiar with\nthose, but with languages 18872 18:03:34,641 --> 18:03:38,552 called HTML, CSS, and JavaScript,\n 18873 18:03:38,552 --> 18:03:43,372 But before we do that, let's provide a\n 18874 18:03:43,372 --> 18:03:46,822 because indeed, we'll start to look\n 18875 18:03:46,822 --> 18:03:51,105 itself works, albeit quickly, so that\n 18876 18:03:51,105 --> 18:03:54,022 all of this code is running, how you\n 18877 18:03:54,021 --> 18:03:56,752 really, ultimately, after\nCS50, you can learn, by just 18878 18:03:56,752 --> 18:03:59,156 poking around other actual websites. 18879 18:03:59,156 --> 18:04:00,531 So the internet, we're all on it. 18880 18:04:00,531 --> 18:04:05,101 Literally, right now, what\nis it, in your own words? 18881 18:04:07,671 --> 18:04:10,371 It's this utility nowadays, that\nwe all rather take for granted. 
18882 18:04:12,654 --> 18:04:14,572 SPEAKER 1: OK, big\nstorage, and indeed, that's 18883 18:04:14,572 --> 18:04:18,451 how the cloud is described, which is\n 18884 18:04:18,451 --> 18:04:21,771 for a whole lot of wires\nand cables and hardware. 18885 18:04:21,771 --> 18:04:24,941 And the internet, other\nformulations of the term, how else? 18886 18:04:24,942 --> 18:04:26,692 AUDIENCE: Bunch of\ndata that we can reach. 18887 18:04:26,692 --> 18:04:28,609 SPEAKER 1: OK, a bunch\nof data that we can all 18888 18:04:28,608 --> 18:04:32,241 reach, by way of being interconnected\n 18889 18:04:32,241 --> 18:04:35,002 And so really, the internet,\ntoo, is a hardware thing. 18890 18:04:35,002 --> 18:04:38,811 There's a whole lot of servers out\n 18891 18:04:38,811 --> 18:04:41,331 via physical cables, via\ninternet service providers 18892 18:04:41,332 --> 18:04:43,382 via wireless connectivity, and the like. 18893 18:04:43,381 --> 18:04:46,161 And once you start to have\nnetworks of networks of networks 18894 18:04:47,362 --> 18:04:50,332 Indeed, Harvard has its own network\n 18895 18:04:50,332 --> 18:04:52,687 and your own home probably\nhas its own network. 18896 18:04:52,686 --> 18:04:54,561 But once you start\nconnecting those networks 18897 18:04:54,561 --> 18:04:58,921 do you get the interconnected network\n 18898 18:04:58,921 --> 18:05:01,551 So there's this whole\nalphabet soup that goes 18899 18:05:01,552 --> 18:05:03,802 with the internet, some of\nwhose acronyms and terms 18900 18:05:03,802 --> 18:05:04,732 you've probably seen before. 18901 18:05:04,732 --> 18:05:06,532 But let's at least peel\nback some of those layers 18902 18:05:06,531 --> 18:05:08,752 and consider what some of\nthe building blocks are. 18903 18:05:08,752 --> 18:05:11,752 So here's a picture of the internet\n 18904 18:05:11,752 --> 18:05:14,182 back in 1969, when it\nwas something called 18905 18:05:14,182 --> 18:05:17,482 ARPANET, from the Advanced\nResearch Projects Agency. 
18906 18:05:17,482 --> 18:05:20,872 And the intent, originally, was just\n 18907 18:05:20,872 --> 18:05:25,702 in Utah and California, literally\nservers, or computers, in each 18908 18:05:25,701 --> 18:05:27,891 of those areas, somehow\ninterconnected with wires 18909 18:05:27,891 --> 18:05:29,616 so that people could\nstart to share data. 18910 18:05:29,616 --> 18:05:33,182 A year later, it expanded to\ninclude MIT and Harvard and others. 18911 18:05:33,182 --> 18:05:36,201 And now fast forward to\ntoday, you have a huge number 18912 18:05:36,201 --> 18:05:39,322 of systems around the world\nthat are on this same network. 18913 18:05:39,322 --> 18:05:41,841 And, in fact, if I\njust pull up a web page 18914 18:05:41,841 --> 18:05:44,301 here, that's sort of\nconstantly changing 18915 18:05:44,302 --> 18:05:48,802 a visualization of the internet as\n 18916 18:05:48,802 --> 18:05:52,072 in the abstract, all of these\nlines and interconnections 18917 18:05:52,072 --> 18:05:56,008 represent just how interconnected\nthe world is today. 18918 18:05:56,008 --> 18:05:59,091 And it just means that there's all the\n 18919 18:05:59,091 --> 18:06:02,822 all of the more hardware giving\n 18920 18:06:02,822 --> 18:06:07,402 But if we focus, really, on just\n 18921 18:06:07,402 --> 18:06:11,872 whether back in 1970, or now in 2021,\n 18922 18:06:11,872 --> 18:06:15,385 yes, a server, but a certain type\n 18923 18:06:15,385 --> 18:06:17,302 And a router, as the\nname implies, just routes 18924 18:06:17,302 --> 18:06:21,302 data left to right, top to\nbottom, from one point to another. 
18925 18:06:21,302 --> 18:06:25,132 And so there's all these servers here\n 18926 18:06:25,131 --> 18:06:28,651 in Comcast's network, Verizon's\nnetwork, your own home network 18927 18:06:28,652 --> 18:06:31,461 you have your own routers out\nthere, whose purpose in life 18928 18:06:31,461 --> 18:06:34,277 is to take in data and then\ndecide, should I send it this way 18929 18:06:34,277 --> 18:06:36,652 or this way, or this way, so\nto speak, assuming there are 18930 18:06:36,652 --> 18:06:38,512 multiple options with multiple cables. 18931 18:06:38,512 --> 18:06:41,692 You, in your home, probably have just\n 18932 18:06:41,692 --> 18:06:46,162 But certainly, if you're a place like\n 18933 18:06:46,161 --> 18:06:49,221 there's probably a whole\nbunch of interconnections 18934 18:06:49,222 --> 18:06:51,891 that the data can then\ntravel across ultimately. 18935 18:06:51,891 --> 18:06:54,981 So how do we get data\namong these routers? 18936 18:06:54,982 --> 18:06:57,652 For instance, if you want\nto send an email to someone 18937 18:06:57,652 --> 18:07:00,502 at Stanford, in California,\nfrom here, on the East Coast 18938 18:07:00,502 --> 18:07:04,771 or if you want to visit\nwww.stanford.edu, how does your laptop 18939 18:07:04,771 --> 18:07:08,557 your phone, your desktop, actually\n 18940 18:07:08,557 --> 18:07:12,112 Well, essentially, your\nlaptop or phone knows 18941 18:07:12,112 --> 18:07:16,141 when it boots up at the beginning of\n 18942 18:07:16,141 --> 18:07:17,682 the address of that local router is. 18943 18:07:17,682 --> 18:07:20,491 So if you want to send an\nemail from my laptop over here 18944 18:07:20,491 --> 18:07:23,701 my laptop is essentially going to\n 18945 18:07:23,701 --> 18:07:25,711 And then, from there, I\ndon't know, I don't care 18946 18:07:25,711 --> 18:07:27,254 how it gets the rest of the distance. 
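The idea that each machine only needs to know the next place to hand the data can be sketched as a lookup table. Everything in this example is hypothetical: the router names are made up for illustration, and the table covers a single destination, whereas real routers choose a next hop per destination address.

```python
# Hypothetical next-hop table for one destination: each machine only
# knows which neighbor to hand the data to next.
NEXT_HOP = {
    "laptop": "harvard-router",
    "harvard-router": "boston-router",
    "boston-router": "california-router",
    "california-router": "stanford-router",
    "stanford-router": "stanford-mail-server",
}


def route(source, destination):
    """Follow next hops from source until the destination is reached."""
    path = [source]
    while path[-1] != destination:
        path.append(NEXT_HOP[path[-1]])
    return path


print(" -> ".join(route("laptop", "stanford-mail-server")))
```

The laptop's entry points only at its local router. It never sees the rest of the path, which is the sense in which the sender does not care how the data covers the remaining distance.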
18947 18:07:27,254 --> 18:07:29,822 But hopefully, within some\nsmall number of steps later 18948 18:07:29,822 --> 18:07:32,461 Harvard's router is going to\nsend it to maybe Boston's router 18949 18:07:32,461 --> 18:07:34,586 is going to send it to\nCalifornia's router is going 18950 18:07:34,586 --> 18:07:37,771 to send it to Stanford's router, until\n 18951 18:07:38,372 --> 18:07:41,822 And we can depict this, actually,\nhow about a bit playfully. 18952 18:07:41,822 --> 18:07:44,072 Thankfully, the course's\nstaff kindly volunteered 18953 18:07:44,072 --> 18:07:48,792 to create a visualization for\nthis, using a familiar technology. 18954 18:07:48,792 --> 18:07:52,322 So here we have some of our TFs\n 18955 18:07:52,322 --> 18:07:55,472 Let me go ahead and full\nscreen this window here. 18956 18:07:55,472 --> 18:07:58,391 Give me just a moment to\npull it up on my screen here. 18957 18:07:58,391 --> 18:08:03,182 And we'll consider what happens if we\n 18958 18:08:03,182 --> 18:08:07,502 from one person or router,\nnamely Phyllis in this case 18959 18:08:07,502 --> 18:08:10,561 in the bottom right hand corner,\nup to Brian, in this case 18960 18:08:10,561 --> 18:08:11,831 in the top left hand corner. 18961 18:08:11,832 --> 18:08:14,461 So each of the staff members\nhere represents exactly one 18962 18:08:14,461 --> 18:08:17,051 of these routers on the internet. 18963 18:08:47,612 --> 18:08:49,891 It actually took us a\nsignificant number of attempts 18964 18:08:49,891 --> 18:08:51,481 to get that ultimately right. 18965 18:08:51,482 --> 18:08:54,632 So when, what was it the\nstaff were all passing here? 
18966 18:08:54,631 --> 18:08:57,714 Here we have just, physically, what\n 18967 18:08:57,714 --> 18:08:59,881 So Phyllis started with an\nenvelope, inside of which 18968 18:08:59,881 --> 18:09:01,951 was that email, presumably,\non the East Coast 18969 18:09:01,951 --> 18:09:05,136 and she wanted to send it to Brian on\n 18970 18:09:05,137 --> 18:09:08,012 And so she had all of these different\n 18971 18:09:08,012 --> 18:09:11,112 between her and point B, namely Brian. 18972 18:09:11,112 --> 18:09:14,671 She could go up, down, in her case, and\n 18973 18:09:14,671 --> 18:09:17,461 could go up, down, left, or right,\n 18974 18:09:17,461 --> 18:09:19,442 And long story short,\nthere's algorithms that 18975 18:09:19,442 --> 18:09:22,812 figure out how you decide\nto send a packet up, down 18976 18:09:22,811 --> 18:09:24,661 left, or right, so to speak. 18977 18:09:24,661 --> 18:09:29,716 But they do so by taking an input, and\n 18978 18:09:29,716 --> 18:09:32,341 And there's at least a couple of\nthings on the outside of this 18979 18:09:32,341 --> 18:09:35,281 because all of these routers and,\n 18980 18:09:35,281 --> 18:09:38,012 and phones these days,\nspeak something called 18981 18:09:38,012 --> 18:09:41,132 TCP/IP, a set of\nacronyms you've probably 18982 18:09:41,131 --> 18:09:44,011 seen somewhere on your\nphone, your Mac or PC 18983 18:09:44,012 --> 18:09:48,422 in print somewhere, which refers\n 18984 18:09:48,421 --> 18:09:51,194 that computers use to\ninter-communicate these days. 18985 18:09:52,112 --> 18:09:54,631 A protocol is like a set\nof rules, that you behave. 18986 18:09:54,631 --> 18:09:57,091 In healthier times, I might\nextend my hand and someone 18987 18:09:57,091 --> 18:10:00,631 like Carter might extend his hand,\n 18988 18:10:00,631 --> 18:10:03,871 on a human protocol of like\nliterally physically shaking hands. 
18989 18:10:03,872 --> 18:10:07,082 Nowadays, we have mask protocols,\nwhereby what you need to do 18990 18:10:08,591 --> 18:10:11,641 But that, too, is just a set of rules\n 18991 18:10:11,641 --> 18:10:13,951 that's somewhere\nstandardized and documented. 18992 18:10:13,951 --> 18:10:16,711 So computers use protocols\nall the time to govern 18993 18:10:16,711 --> 18:10:19,622 how they are sending information\nand receiving information. 18994 18:10:19,622 --> 18:10:24,482 And TCP and IP are two such protocols\n 18995 18:10:24,482 --> 18:10:27,212 What TCP/IP tells someone\nlike Phyllis to do 18996 18:10:27,211 --> 18:10:31,152 if she wants to send an email to Brian,\n 18997 18:10:31,902 --> 18:10:37,082 But on the outside of that virtual\n 18998 18:10:37,082 --> 18:10:41,132 And I'll describe this as destination\n 18999 18:10:41,131 --> 18:10:44,011 just like in our human world,\nyou would write the destination 19000 18:10:45,332 --> 18:10:49,202 And then she's going to put her own\n 19001 18:10:49,201 --> 18:10:51,750 corner, just like you, the\nsender, would put your own source 19002 18:10:51,750 --> 18:10:53,171 address in the human world. 19003 18:10:53,171 --> 18:10:56,432 But, instead of these addresses\n 19004 18:10:56,432 --> 18:11:01,141 Cambridge, Massachusetts 02138, USA,\n 19005 18:11:01,141 --> 18:11:05,641 on the internet have unique addresses\n 19006 18:11:05,641 --> 18:11:08,192 And an IP address is\njust a numeric identifier 19007 18:11:08,192 --> 18:11:11,461 on the internet, that allows\ncomputers, like Phyllis and Brian 19008 18:11:11,461 --> 18:11:14,432 to address these envelopes\nto and from each other. 19009 18:11:14,432 --> 18:11:16,741 And you've probably seen\nthe format at some point. 19010 18:11:16,741 --> 18:11:20,311 Typically, the format of IP\naddresses is something dot something 19011 18:11:20,311 --> 18:11:22,261 dot something dot something. 
19012 18:11:22,262 --> 18:11:24,872 Each of those somethings,\nrepresented here with a hash symbol 19013 18:11:24,872 --> 18:11:29,852 is a number from 0 through 255. 19014 18:11:29,851 --> 18:11:32,911 And, based on that little\nhint, if each of these hashes 19015 18:11:32,911 --> 18:11:36,631 represents a number from 0\nto 255, each of those hashes 19016 18:11:36,631 --> 18:11:39,542 is represented with\nhow many bytes or bits? 19017 18:11:39,542 --> 18:11:43,442 Eight bits or one byte, which is to\n 19018 18:11:43,442 --> 18:11:47,072 an IP address must use\n32 bits or 4 bytes 19019 18:11:47,072 --> 18:11:50,641 if we rewind now to some of the\n 19020 18:11:50,641 --> 18:11:52,991 And what that means is,\nat least at a glance 19021 18:11:52,991 --> 18:11:57,153 it looks like we have 4 billion some\n 19022 18:11:57,154 --> 18:11:58,862 Now, unfortunately,\nthere's a huge number 19023 18:11:58,862 --> 18:12:01,772 of humans in the world these days,\n 19024 18:12:01,771 --> 18:12:05,221 have multiple devices, certainly\n 19025 18:12:05,222 --> 18:12:08,974 a laptop, and a phone, and you have\n 19026 18:12:08,974 --> 18:12:10,391 all of which need to be addressed. 19027 18:12:10,391 --> 18:12:12,332 So there's another type\nof IP address that's 19028 18:12:12,332 --> 18:12:14,012 starting to be used more commonly. 19029 18:12:16,411 --> 18:12:19,232 There's also version 6\nwhich, instead of 32 bits 19030 18:12:19,232 --> 18:12:23,552 uses 128 bits, which gives us a\n 19031 18:12:23,551 --> 18:12:27,872 for computers, so we can at least handle\n 19032 18:12:28,812 --> 18:12:31,772 So this is to say, what ultimately\nis going on this envelope 19033 18:12:31,771 --> 18:12:35,941 is the destination address, that is\n 19034 18:12:35,941 --> 18:12:40,021 address, that is Phyllis's IP address,\n 19035 18:12:40,021 --> 18:12:43,051 to point B, and if need\nbe, back, by just flipping 19036 18:12:43,051 --> 18:12:44,641 the source and the destination. 
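The arithmetic above (four numbers from 0 through 255, eight bits each, 32 bits in total) can be sketched in a few lines of Python. This is only an illustration: the function name `dotted_quad_to_int` is made up here, and `192.0.2.1` is a documentation-only placeholder address, not a machine from the lecture.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Pack the four 0-255 numbers of an IPv4 address into one 32-bit integer."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"not a something.something.something.something address: {address!r}")
    value = 0
    for part in parts:
        value = (value << 8) | part  # each number occupies exactly 8 bits (1 byte)
    return value


IPV4_ADDRESSES = 2 ** 32   # the "4 billion some" possible version 4 addresses
IPV6_ADDRESSES = 2 ** 128  # version 6 uses 128 bits instead of 32

example = "192.0.2.1"  # placeholder address reserved for documentation
print(dotted_quad_to_int(example))
# The standard library agrees with the hand-rolled packing:
print(int(ipaddress.IPv4Address(example)) == dotted_quad_to_int(example))
print(f"{IPV4_ADDRESSES:,} IPv4 addresses, {IPV6_ADDRESSES:,} IPv6 addresses")
```

Shifting left by 8 for each part is the same as saying that every "something" takes one byte, which is why the largest address, 255.255.255.255, comes out as 2 to the 32nd power minus 1.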
19037 18:12:44,642 --> 18:12:49,652 But on the internet, you presumably know\n 19038 18:12:49,652 --> 18:12:53,191 There's web servers, there's chat\n 19039 18:12:53,191 --> 18:12:56,191 Like there's all of these different\n 19040 18:12:56,191 --> 18:12:58,771 And so, when Brian\nreceives that envelope 19041 18:12:58,771 --> 18:13:05,432 how does he know it's an email, versus\n 19042 18:13:05,432 --> 18:13:07,622 versus something else altogether. 19043 18:13:07,622 --> 18:13:10,771 Well, it turns out that we\ncan look at the other part 19044 18:13:10,771 --> 18:13:14,281 of this acronym, the TCP in TCP/IP. 19045 18:13:14,282 --> 18:13:17,172 And what TCP allows us\nto do, for instance 19046 18:13:17,172 --> 18:13:18,822 is specify a couple of things. 19047 18:13:18,822 --> 18:13:23,792 One, the type of service whose\n 19048 18:13:23,792 --> 18:13:26,881 it does this with a numeric identifier. 19049 18:13:26,881 --> 18:13:31,652 And I'm going to go ahead and write down\n 19050 18:13:31,652 --> 18:13:35,222 And I'm going to write that in the\n 19051 18:13:35,221 --> 18:13:37,381 So technically, now,\nwhat's on this envelope 19052 18:13:37,381 --> 18:13:40,111 is not just the addresses,\nbut also a unique number 19053 18:13:40,111 --> 18:13:45,182 that represents what kind of service\n 19054 18:13:45,182 --> 18:13:48,792 whether it's email, or web traffic,\nor Skype, or something else. 19055 18:13:48,792 --> 18:13:53,342 These numbers are standardized, and here\n 19056 18:13:53,342 --> 18:13:56,192 not even in the context of email,\nbut in the context of the web. 19057 18:13:56,191 --> 18:13:59,131 Port 80 is typically used\nwhenever an envelope contains 19058 18:13:59,131 --> 18:14:03,361 a web page, or a request\ntherefor, or the number 443 19059 18:14:03,361 --> 18:14:07,201 when that request is actually\n 19060 18:14:07,202 --> 18:14:11,192 know, in URLs, known as HTTPS,\n 19061 18:14:11,191 --> 18:14:13,081 More on what the HTTP means later. 
19062 18:14:13,081 --> 18:14:16,955 If it's email, the number\nmight be 25 or 465, or 587. 19063 18:14:16,955 --> 18:14:19,872 These are the kinds of things you\n 19064 18:14:19,872 --> 18:14:24,271 But if you've ever had to configure,\nlike, Outlook or even Gmail 19065 18:14:24,271 --> 18:14:26,671 to talk to another account,\nyou might very well 19066 18:14:26,672 --> 18:14:30,902 have seen these numbers, by typing\n 19067 18:14:30,902 --> 18:14:33,892 and then a number, which is only to\n 19068 18:14:33,892 --> 18:14:35,642 But they're typically\nnot things you and I 19069 18:14:35,642 --> 18:14:38,582 have to care about, because\nservers and computers nowadays 19070 18:14:38,581 --> 18:14:40,811 automate much of this process. 19071 18:14:40,812 --> 18:14:45,211 But that's all it takes, ultimately, for\n 19072 18:14:45,211 --> 18:14:47,042 But what if it's a really big message? 19073 18:14:47,042 --> 18:14:50,351 If it's a short email, It might\n 19074 18:14:50,941 --> 18:14:54,241 But suppose that Phyllis wants\nto send Brian a picture of a cat 19075 18:14:54,241 --> 18:14:56,581 like this, or worse, a video of a cat. 19076 18:14:56,581 --> 18:15:00,811 It would be kind of inequitable\nif no one else could do anything 19077 18:15:00,812 --> 18:15:03,062 on the internet, just\nbecause Phyllis wants 19078 18:15:03,062 --> 18:15:06,601 to send Brian a really big picture,\na really big video of a cat. 
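The port numbers just mentioned can be pictured as a small lookup table written on the outside of the envelope. The sketch below is an assumption-laden illustration, not how any real mail or web server is written: `WELL_KNOWN_PORTS`, `describe_envelope`, and the two addresses are hypothetical names and placeholders, and only the ports named in the lecture are listed.

```python
# Ports named in the lecture; the labels are informal descriptions.
WELL_KNOWN_PORTS = {
    80: "web (HTTP)",
    443: "web, secure (HTTPS)",
    25: "email",
    465: "email",
    587: "email",
}


def describe_envelope(source: str, destination: str, port: int) -> str:
    """Summarize what is written on the outside of one virtual envelope."""
    service = WELL_KNOWN_PORTS.get(port, "unknown service")
    return f"from {source} to {destination}, port {port}: {service}"


# Placeholder addresses from the ranges reserved for documentation.
print(describe_envelope("192.0.2.10", "198.51.100.20", 443))
print(describe_envelope("192.0.2.10", "198.51.100.20", 587))
```

The receiver only has to look at the number to decide which program should open the envelope, which is why the numbers need to be standardized.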
19079 18:15:06,601 --> 18:15:10,351 It would be nice if we could kind\n 19080 18:15:10,351 --> 18:15:13,529 across these routers, so that\nwe can give a little bit of time 19081 18:15:13,529 --> 18:15:15,572 to Phyllis, a little bit\nof time to someone else 19082 18:15:15,572 --> 18:15:18,779 a little bit of time to someone else,\n 19083 18:15:19,892 --> 18:15:25,472 But in terms of fairness, she\ndoesn't monopolize the bandwidth 19084 18:15:27,842 --> 18:15:31,982 And this, then, allows us to\ndo one other feature of TCP/IP 19085 18:15:31,982 --> 18:15:35,072 which is fragmentation,\nwhere we can temporarily 19086 18:15:35,072 --> 18:15:37,952 and Phyllis's computer would\ndo this automatically, fragment 19087 18:15:37,952 --> 18:15:41,582 the big packet in question,\nor the big file in question 19088 18:15:41,581 --> 18:15:46,561 and then use, not just a single\n 19089 18:15:48,369 --> 18:15:50,161 If we do that, though,\nwe're probably going 19090 18:15:50,161 --> 18:15:54,031 to need one other piece of information,\n 19091 18:15:54,032 --> 18:15:57,332 Like, if you were implementing this,\n 19092 18:15:57,331 --> 18:16:00,301 into four parts, like,\nintuitively, what might you 19093 18:16:00,301 --> 18:16:03,911 want to put virtually on the\noutside of this envelope now? 19094 18:16:05,259 --> 18:16:06,842 SPEAKER 1: The order of them, somehow. 19095 18:16:06,842 --> 18:16:10,472 So probably something like part\none of four, part two of four 19096 18:16:10,471 --> 18:16:12,161 part three of four, and so forth. 19097 18:16:12,161 --> 18:16:14,792 So I'm going to write one more thing\n 19098 18:16:15,301 --> 18:16:17,551 I put some kind of\nsequence number, that's 19099 18:16:17,551 --> 18:16:20,191 just a little bit of a\nclue to Brian, to know 19100 18:16:20,191 --> 18:16:22,171 in what order to\nreassemble these things. 
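The "part one of four, part two of four" idea can be sketched as follows. This is a simplified model of the lecture's envelope analogy, not the real TCP implementation (real TCP numbers bytes rather than whole parts); `fragment` and `reassemble` are hypothetical helper names.

```python
import random


def fragment(message: bytes, size: int) -> list[tuple[int, int, bytes]]:
    """Split a message into (sequence number, total, chunk) envelopes."""
    chunks = [message[i:i + size] for i in range(0, len(message), size)]
    total = len(chunks)
    return [(number, total, chunk) for number, chunk in enumerate(chunks, start=1)]


def reassemble(fragments: list[tuple[int, int, bytes]]) -> bytes:
    """Put envelopes back in order using their sequence numbers."""
    ordered = sorted(fragments, key=lambda envelope: envelope[0])
    return b"".join(chunk for _, _, chunk in ordered)


message = b"a really big picture of a cat"
envelopes = fragment(message, size=8)
random.shuffle(envelopes)  # envelopes may take different routes and arrive out of order
print(reassemble(envelopes) == message)
```

Because each envelope carries its own number, the receiver does not care in what order the pieces arrive.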
19101 18:16:22,172 --> 18:16:25,202 And even more powerfully\nthan that, this actually 19102 18:16:25,202 --> 18:16:29,252 gives us this simple primitive of\n 19103 18:16:30,452 --> 18:16:35,881 If Brian receives envelopes like these,\n 19104 18:16:35,881 --> 18:16:39,691 field, what other feature\ndoes TCP apparently 19105 18:16:39,691 --> 18:16:42,542 enable Brian and Phyllis to implement? 19106 18:16:44,282 --> 18:16:46,612 But it's not just the\nordering of the packets. 19107 18:16:46,611 --> 18:16:50,701 What else might be useful about\nputting numbers on these things 19108 18:16:55,187 --> 18:16:56,932 AUDIENCE: How about if you like missed. 19109 18:16:56,932 --> 18:16:58,851 SPEAKER 1: If you missed something\nthat was intended to be sent 19110 18:16:59,851 --> 18:17:04,110 So short answer, exactly, yes, TCP,\n 19111 18:17:04,110 --> 18:17:06,652 that we\'re including, can quote\nunquote "guarantee" delivery. 19112 18:17:07,161 --> 18:17:10,341 Because if Brian receives one\nout of four, two out of four 19113 18:17:10,342 --> 18:17:12,562 four out of four, but\nnot three out of four 19114 18:17:12,562 --> 18:17:16,402 he now knows, predictably, that\n 19115 18:17:17,911 --> 18:17:21,801 And so this is why pretty much\nalways, if you receive an email 19116 18:17:21,801 --> 18:17:24,411 you either receive the whole\nthing, or nothing at all. 19117 18:17:24,411 --> 18:17:27,741 Like sentences and words and\nparagraphs should never really 19118 18:17:29,361 --> 18:17:31,478 Or if you download a\nphotograph on the web 19119 18:17:31,479 --> 18:17:33,562 it shouldn't just have a\nblank hole in the middle 19120 18:17:33,562 --> 18:17:36,652 just because that packet of\ninformation happened to be lost. 
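The "one out of four, two out of four, four out of four, but not three" check can be written as a tiny function. As before, this follows the lecture's simplified picture rather than the real protocol, and `missing_sequence_numbers` is a hypothetical name.

```python
def missing_sequence_numbers(received: set[int], total: int) -> list[int]:
    """Return the parts that never arrived and should be sent again."""
    return [number for number in range(1, total + 1) if number not in received]


# Brian received parts 1, 2, and 4 of 4, so he can ask Phyllis to resend part 3.
print(missing_sequence_numbers({1, 2, 4}, total=4))
```

An empty result means everything arrived, and only then would the receiver hand the reassembled message to the email program or browser.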
19121 18:17:36,652 --> 18:17:40,851 TCP, if it is the protocol being used to\n 19122 18:17:40,851 --> 18:17:44,792 ensures that it either all gets there,\n 19123 18:17:44,792 --> 18:17:48,112 So this is an important property,\nbut, just as a teaser there's 19124 18:17:49,312 --> 18:17:52,732 There's something called UDP,\nwhich is an alternative to TCP 19125 18:17:52,732 --> 18:17:54,472 that doesn't guarantee delivery. 19126 18:17:54,471 --> 18:17:58,072 And just as a taste of why you might\n 19127 18:17:58,072 --> 18:18:03,366 maybe you're watching like a streaming\n 19128 18:18:03,366 --> 18:18:05,241 You probably don't\nnecessarily want the thing 19129 18:18:05,241 --> 18:18:08,476 to buffer and buffer and buffer, just\n 19130 18:18:08,476 --> 18:18:10,351 because you're going to\nstart to miss things. 19131 18:18:10,351 --> 18:18:12,471 And then you're going to be the\nonly one in the world watching 19132 18:18:12,471 --> 18:18:15,849 the game that ended 20 minutes ago, when\n 19133 18:18:15,850 --> 18:18:18,142 Similarly for a voice call,\nit would be really annoying 19134 18:18:18,142 --> 18:18:19,942 if our voice is constantly buffered. 
19135 18:18:19,941 --> 18:18:22,678 So UDP might be a good\nprotocol for making sure 19136 18:18:22,679 --> 18:18:25,762 that, even if the person on the other\n 19137 18:18:26,991 --> 18:18:29,781 It's not pausing and\nresending and resending 19138 18:18:29,782 --> 18:18:33,961 because that would really slow down\n 19139 18:18:33,961 --> 18:18:36,861 So, in short, IP handles the\naddressing of these packets 19140 18:18:36,861 --> 18:18:40,341 and standardizes numbers that every\n 19141 18:18:40,342 --> 18:18:44,542 and TCP handles the standardization\nof like what services 19142 18:18:44,542 --> 18:18:50,661 can be used, between points A and\n 19143 18:18:50,661 --> 18:18:54,652 but presumably, when Phyllis\nsends a message to Brian 19144 18:18:54,652 --> 18:18:56,782 she doesn't really know\nand probably shouldn't 19145 18:18:56,782 --> 18:18:58,912 care what his IP address is, right? 19146 18:18:58,911 --> 18:19:01,491 These days it's, like, I don't\nknow most of the phone numbers 19147 18:19:02,661 --> 18:19:04,614 I instead look them up in some way. 19148 18:19:04,614 --> 18:19:07,072 And, indeed, when you visit a\nwebsite, what do you type in? 19149 18:19:07,072 --> 18:19:10,048 It's typically not something\ndot something dot something dot 19150 18:19:10,048 --> 18:19:12,298 something, where each of\nthose somethings is a number. 19151 18:19:12,298 --> 18:19:14,182 What do you typically\ntype in to a browser? 19152 18:19:15,622 --> 18:19:20,031 Something like Stanford.edu,\nHarvard.edu, Yale.edu, gmail.com 19153 18:19:20,031 --> 18:19:22,072 or any other such domain name. 19154 18:19:22,072 --> 18:19:24,622 And so, thankfully,\nthere's another system 19155 18:19:24,622 --> 18:19:29,551 on the internet, one more acronym for\n 19156 18:19:29,551 --> 18:19:33,742 And pretty much every network on the\n 19157 18:19:33,741 --> 18:19:37,838 your own home network, somewhere,\nsomehow has a DNS server. 
19158 18:19:37,839 --> 18:19:39,922 You probably didn't have\nto configure it yourself. 19159 18:19:39,922 --> 18:19:44,342 Someone else did, your campus, your\n 19160 18:19:44,342 --> 18:19:48,112 But there is some server connected\n 19161 18:19:48,111 --> 18:19:52,701 via wires or wirelessly, that just\n 19162 18:19:52,702 --> 18:19:55,672 a big spreadsheet, if you\nwill, or, if you prefer 19163 18:19:55,672 --> 18:19:59,992 a hash table, that has at least\ntwo columns of keys and values 19164 18:20:00,812 --> 18:20:02,241 Where on the left hand\nside is what we'll 19165 18:20:02,241 --> 18:20:04,262 call domain name,\nsomething like Harvard.edu 19166 18:20:04,262 --> 18:20:08,902 Yale.edu, an IP address on the\nright hand side, that is to say 19167 18:20:08,902 --> 18:20:13,612 a DNS server's purpose in life\nis just to translate domain names 19168 18:20:14,896 --> 18:20:17,271 And vice versa, if you want\nto go in the other direction 19169 18:20:17,271 --> 18:20:22,072 and technically, just to be precise, it\n 19170 18:20:23,432 --> 18:20:25,441 And we'll see what those\nare in just a moment. 19171 18:20:25,441 --> 18:20:27,652 But again, all of this just\nkind of happens magically 19172 18:20:27,652 --> 18:20:29,402 when you turn on your\nphone or your laptop 19173 18:20:29,402 --> 18:20:33,292 today, because all of these things\n 19174 18:20:33,292 --> 18:20:38,752 So how can we actually start to\n 19175 18:20:38,751 --> 18:20:44,911 Well, let's go ahead and poke around,\n 19176 18:20:44,911 --> 18:20:48,531 Let's see what we can actually do\n 19177 18:20:48,532 --> 18:20:52,672 If we now have the ability to\nmove data from point A to point B 19178 18:20:52,672 --> 18:20:55,252 and what can be in that envelope\ncould be, yes, an email 19179 18:20:55,251 --> 18:20:58,011 but today, onward, it's really\ngoing to be web content. 
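The "big spreadsheet" or hash table with two columns can be modeled directly as a dictionary. The names and addresses below are placeholders chosen for illustration (documentation-only addresses, not real records), and `resolve` is a hypothetical function; a real DNS server is of course more involved than one in-memory table.

```python
from typing import Optional

# Left column: domain name. Right column: IP address. All values are placeholders.
DNS_TABLE = {
    "www.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.25",
}


def resolve(name: str) -> Optional[str]:
    """Translate a name to an IP address, or None if the table has no row for it."""
    return DNS_TABLE.get(name.lower())  # domain names are not case-sensitive


print(resolve("WWW.Example.com"))
print(resolve("no-such-host.example.com"))
```

To ask a real DNS server instead of this toy table, Python's standard library offers `socket.gethostbyname("example.com")`, which needs a network connection and so is left out of the sketch.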
19180 18:20:58,012 --> 18:21:00,202 There's going to be content\nthat you're requesting 19181 18:21:00,202 --> 18:21:01,584 like give me today's home page. 19182 18:21:01,584 --> 18:21:03,292 And there's content\nyou're sending, which 19183 18:21:03,292 --> 18:21:05,512 would be the contents of\nthat actual home page. 19184 18:21:05,512 --> 18:21:10,642 And so, just to go one level deeper,\n 19185 18:21:10,642 --> 18:21:14,602 are getting from point A to\npoint B using TCP/IP, let's 19186 18:21:14,601 --> 18:21:19,221 put something specific inside of them,\n 19187 18:21:19,221 --> 18:21:24,149 but something called HTTP, which\n 19188 18:21:24,149 --> 18:21:25,941 You've seen this for\ndecades now, probably 19189 18:21:25,941 --> 18:21:29,331 in the form of URLs, so much so that you\n 19190 18:21:29,331 --> 18:21:31,521 Your browser just adds\nit for you automatically 19191 18:21:31,521 --> 18:21:35,101 and you just type in Harvard.edu,\nor Yale.edu, or the like. 19192 18:21:35,101 --> 18:21:38,391 But HTTP is just a final\nprotocol that we'll 19193 18:21:38,392 --> 18:21:42,412 talk about here, that just\nstandardizes how web browsers and web 19194 18:21:44,611 --> 18:21:47,601 So this is a distinction now\nbetween the internet and the web. 19195 18:21:47,601 --> 18:21:50,001 The internet is really like\nthe low-level plumbing 19196 18:21:50,001 --> 18:21:53,091 all of the cables, all of the\ntechnology that just moves packets 19197 18:21:53,092 --> 18:21:57,322 from left to right, right to left, top\n 19198 18:21:57,322 --> 18:22:01,941 to point B. You can do anything you\n 19199 18:22:01,941 --> 18:22:06,121 email and web and video and chat\nand gaming, and all of that. 19200 18:22:06,122 --> 18:22:09,652 So HTTP, or the web,\nis just one application 19201 18:22:09,652 --> 18:22:13,695 that is conceptually on top of,\nbuilt on top of the internet.
19202 18:22:13,695 --> 18:22:15,862 Once you take for granted\nthat there is an internet 19203 18:22:15,861 --> 18:22:17,451 you can do really\ninteresting things with it 19204 18:22:17,452 --> 18:22:19,942 just like in our physical world,\nonce you have electricity 19205 18:22:19,941 --> 18:22:22,941 you can just assume you can do really\n 19206 18:22:22,941 --> 18:22:25,101 without even knowing\nor caring how it works. 19207 18:22:25,101 --> 18:22:28,801 But now that you'll be\nprogramming for the web 19208 18:22:28,801 --> 18:22:32,131 it's useful to understand how\nsome of these things indeed work. 19209 18:22:32,131 --> 18:22:36,202 So let's take a peek at the\nformat of the things that 19210 18:22:37,402 --> 18:22:39,922 These days, it's usually\nactually HTTPS that's 19211 18:22:39,922 --> 18:22:42,172 in play, where, again,\nthe S just means secure. 19212 18:22:42,172 --> 18:22:47,092 More on that later, but the HTTP is\n 19213 18:22:47,092 --> 18:22:48,472 go inside of these envelopes. 19214 18:22:48,471 --> 18:22:52,281 And wonderfully, it's just\ntextual information, typically. 19215 18:22:52,282 --> 18:22:56,992 There is a simple text format\nthat humans decided on years ago 19216 18:22:56,991 --> 18:23:00,441 that goes inside of these\nenvelopes, that tells a browser how 19217 18:23:00,441 --> 18:23:04,762 to request information from a server,\n 19218 18:23:04,762 --> 18:23:06,392 to that client with information. 19219 18:23:06,392 --> 18:23:12,592 So here's, for instance, a canonical\n 19220 18:23:12,592 --> 18:23:14,362 What might you see at the end of this? 19221 18:23:14,361 --> 18:23:15,829 You might sometimes see a slash. 19222 18:23:15,830 --> 18:23:18,622 Browsers nowadays kind of simplify\n 19223 18:23:18,622 --> 18:23:21,771 But slash, as we'll see, just\nrepresents like the default 19224 18:23:21,771 --> 18:23:24,652 folder, the root of the\nweb server's hard drive 19225 18:23:24,652 --> 18:23:26,077 like whatever the base is of it. 
19226 18:23:26,077 --> 18:23:32,182 It's like C colon backslash on\n 19227 18:23:32,182 --> 18:23:34,432 But a URL can have more than that. 19228 18:23:34,432 --> 18:23:36,982 It can have slash path,\nwhere path is just a word 19229 18:23:36,982 --> 18:23:40,552 or multiple words, that sort of\n 19230 18:23:40,551 --> 18:23:43,072 That path could actually be\na specific file, we'll see 19231 18:23:43,072 --> 18:23:45,262 like something called file.html. 19232 18:23:45,262 --> 18:23:48,682 More on HTML in just a bit, or\nit can even be slash folder 19233 18:23:48,682 --> 18:23:52,911 maybe with another slash, or\nmaybe it can be /folder/file.html. 19234 18:23:52,911 --> 18:23:57,081 Now these days Safari, and even Chrome\n 19235 18:23:57,081 --> 18:24:00,621 are in the habit of trying to hide\n 19236 18:24:02,361 --> 18:24:05,121 Ultimately, though, it'll\nbe useful to understand 19237 18:24:05,122 --> 18:24:08,812 what URLs you're at, because\nit maps directly to the code 19238 18:24:08,812 --> 18:24:10,642 that we're ultimately going to write. 19239 18:24:10,642 --> 18:24:13,282 But this is only to say that\nall this stuff in yellow 19240 18:24:13,282 --> 18:24:18,322 refers to, presumably, a specific\nfile and/or folder on the web 19241 18:24:18,322 --> 18:24:20,214 server, on which you're programming. 19242 18:24:21,172 --> 18:24:26,002 Example.com, this is the domain\n 19243 18:24:26,001 --> 18:24:28,621 Example.com is the\nso-called domain name. 19244 18:24:28,622 --> 18:24:33,532 This whole thing, www.example.com,\n 19245 18:24:33,532 --> 18:24:37,432 And what the WW is referring\nto is specifically the name 19246 18:24:37,432 --> 18:24:40,292 of a specific server in that domain. 19247 18:24:40,292 --> 18:24:44,842 So back in the day, there was\na www.example.com web server. 19248 18:24:44,842 --> 18:24:48,442 There might have been a\nmail.example.com mail server. 19249 18:24:48,441 --> 18:24:51,501 There might have been a\nchat.example.com chat server. 
19250 18:24:51,501 --> 18:24:56,331 Nowadays, this hostname, or\nsubdomain, depending on the context 19251 18:24:56,331 --> 18:24:58,581 can actually refer to a whole\nbunch of servers, right? 19252 18:24:58,581 --> 18:25:01,792 When you go to www.facebook.com,\nthat's not one server 19253 18:25:01,792 --> 18:25:03,652 that's thousands of servers nowadays. 19254 18:25:03,652 --> 18:25:05,631 So long story short,\nthere's technology that 19255 18:25:05,631 --> 18:25:08,122 somehow gets your data\nto one of those servers 19256 18:25:08,122 --> 18:25:11,491 but this whole thing is what we\n 19257 18:25:11,491 --> 18:25:13,822 This thing here, hostname,\nin the context of an email 19258 18:25:13,822 --> 18:25:16,702 address it might alternatively\nbe called a subdomain. 19259 18:25:16,702 --> 18:25:20,062 This thing here, top\nlevel domain, you probably 19260 18:25:20,062 --> 18:25:23,422 know that dot com means commercial,\n 19261 18:25:23,422 --> 18:25:25,762 Dot org is similar, dot net. 19262 18:25:25,762 --> 18:25:29,182 Some of them are a bit restricted,\n 19263 18:25:29,182 --> 18:25:31,711 dot edu is just for accredited\neducational institutions. 19264 18:25:31,711 --> 18:25:35,211 But there are hundreds, if\nnot more, top level domains 19265 18:25:35,211 --> 18:25:37,702 nowadays, some more popular than others. 19266 18:25:37,702 --> 18:25:41,542 CS50's tools, for instance, use CS50.io. 19267 18:25:41,542 --> 18:25:44,182 IO sort of connotes input-output. 19268 18:25:44,182 --> 18:25:49,131 It actually belongs, though, to\n 19269 18:25:49,131 --> 18:25:55,366 whose country code is .io, and you see\n 19270 18:25:56,241 --> 18:26:00,399 Indeed, it's something.uk,\nsomething.jp, and the like typically 19271 18:26:01,191 --> 18:26:04,042 But some of them have been\nrather co-opted, .tv as well 19272 18:26:04,042 --> 18:26:06,932 because they have these\nmeanings in English as well. 19273 18:26:06,932 --> 18:26:08,991 Lastly, this is what\nwe'll call the protocol.
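The pieces of a URL named above (protocol, hostname, domain name, top level domain, path) can be pulled apart with Python's standard `urllib.parse` module. The sketch below is deliberately naive: `describe_url` is a hypothetical helper, and splitting the host on dots gives wrong answers for multi-part suffixes such as something.co.uk, so treat it as an illustration of the vocabulary only.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Label the parts of a URL using the lecture's terms (naive dot-splitting)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    return {
        "protocol": parts.scheme,                           # e.g. https
        "hostname": labels[0] if len(labels) > 2 else "",   # e.g. www
        "domain": ".".join(labels[-2:]),                    # e.g. example.com
        "tld": labels[-1],                                  # e.g. com
        "full_name": host,                                  # e.g. www.example.com
        "path": parts.path or "/",                          # "/" is the default
    }


for key, value in describe_url("https://www.example.com/folder/file.html").items():
    print(f"{key}: {value}")
```

Note that a URL with nothing after the domain is treated as if it ended in a slash, matching the idea that slash stands for the default page at the root of the server.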
19274 18:26:08,991 --> 18:26:13,042 That specifies how the server uses\n 19275 18:26:13,042 --> 18:26:16,351 to point B. So what is\ninside of this envelope? 19276 18:26:16,351 --> 18:26:18,622 Let's now start poking\naround a little bit more. 19277 18:26:18,622 --> 18:26:20,211 What is inside of this envelope? 19278 18:26:20,211 --> 18:26:23,782 It's essentially, for our\npurposes today, one of two verbs 19279 18:26:25,792 --> 18:26:28,612 And if any of you have dabbled\n 19280 18:26:28,611 --> 18:26:30,569 you might have seen some\nof these terms before. 19281 18:26:30,570 --> 18:26:34,702 But these two verbs describe\njust how to send information 19282 18:26:36,952 --> 18:26:39,172 Long story short, more\non this next week 19283 18:26:39,172 --> 18:26:43,472 GET means put any user input\nin the URL, POST means hide it 19284 18:26:43,471 --> 18:26:46,641 so that things you're searching for,\n 19285 18:26:46,642 --> 18:26:49,634 usernames and passwords you're\n 19286 18:26:49,634 --> 18:26:51,592 and are therefore visible\nto anyone with access 19287 18:26:51,592 --> 18:26:53,672 to your computer and\nyour search history 19288 18:26:53,672 --> 18:26:57,802 but rather they're somehow provided\n 19289 18:26:57,801 --> 18:27:00,292 But for now, we'll focus\nalmost entirely on GET 19290 18:27:00,292 --> 18:27:03,741 which is perhaps the most common\n 19291 18:27:03,741 --> 18:27:05,251 And what we're going to do is this. 19292 18:27:05,251 --> 18:27:07,461 Let me switch over just\nto a blank screen here. 19293 18:27:07,461 --> 18:27:11,182 And if we assume that little\nold me is this laptop here 19294 18:27:11,182 --> 18:27:16,191 and I'm connected to the cloud, and\n 19295 18:27:16,191 --> 18:27:20,182 want to request the web page\nof, Harvard.edu or Yale.edu 19296 18:27:20,182 --> 18:27:22,982 it's really going to\nbe a two-step process. 
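The difference between the two verbs can be sketched by showing where the user's input ends up. The path `/search` and the key `q` are hypothetical, as are the two function names; the point is only that GET places the input in the URL while POST keeps it out of the URL and sends it separately.

```python
from urllib.parse import urlencode


def get_target(path: str, user_input: dict) -> str:
    """GET: the input becomes part of the URL, so it shows up in history."""
    return f"{path}?{urlencode(user_input)}"


def post_target_and_body(path: str, user_input: dict) -> tuple[str, str]:
    """POST: the URL stays clean and the input travels inside the envelope."""
    return path, urlencode(user_input)


print(get_target("/search", {"q": "cats"}))
print(post_target_and_body("/search", {"q": "cats"}))
```

Keeping input out of the URL hides it from the address bar and search history; it does not by itself encrypt anything, which is a separate matter handled by HTTPS.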
19297 18:27:22,982 --> 18:27:27,892 There's going to be a request,\n 19298 18:27:27,892 --> 18:27:29,752 and then, hopefully,\nthe server that hears 19299 18:27:29,751 --> 18:27:34,011 that request is going to reply with\n 19300 18:27:34,012 --> 18:27:37,042 And other terms that are\nrelevant here, is my laptop 19301 18:27:37,042 --> 18:27:40,642 is the so-called client,\nHarvard.edu, Yale.edu, whatever 19302 18:27:40,642 --> 18:27:42,352 it is, is the so-called server. 19303 18:27:42,351 --> 18:27:45,441 And just like in a restaurant, where\n 19304 18:27:45,441 --> 18:27:47,091 the server might bring it to you. 19305 18:27:47,092 --> 18:27:49,372 It's, again, that kind of\nbidirectional relationship. 19306 18:27:49,372 --> 18:27:54,381 One request, one response, for\neach such web page we request. 19307 18:27:54,381 --> 18:27:58,042 All right, so what's inside these\n 19308 18:27:58,042 --> 18:28:01,012 Well, this arrow, this line I\njust drew from left to right 19309 18:28:01,012 --> 18:28:05,301 representing the request, technically\n 19310 18:28:05,301 --> 18:28:08,031 When you visit a web\npage, using your browser 19311 18:28:08,032 --> 18:28:11,452 on your phone, laptop, or desktop,\n 19312 18:28:11,452 --> 18:28:14,794 and the textual message your Mac or PC\n 19313 18:28:14,793 --> 18:28:16,251 looks a little something like this. 19314 18:28:16,251 --> 18:28:20,111 The verb GET, the URL, or rather\nthe path that you want to get 19315 18:28:20,111 --> 18:28:22,941 slash represents the\ndefault page on the website. 19316 18:28:22,941 --> 18:28:27,682 HTTP/1.1 is just some mention of\n 19317 18:28:27,682 --> 18:28:31,221 Now we're up to version 2, and\n 19318 18:28:31,221 --> 18:28:35,811 And the envelope contains some\n 19319 18:28:35,812 --> 18:28:37,502 the fully qualified domain name. 
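The textual message described above is short enough to assemble by hand. The following sketch builds the two lines the lecture shows, the request line and the Host header, and nothing else; `build_get_request` is a hypothetical name, and a real browser sends many more headers.

```python
def build_get_request(host: str, path: str = "/") -> str:
    """Build a minimal HTTP/1.1 GET request as plain text."""
    lines = [
        f"GET {path} HTTP/1.1",  # verb, path, and protocol version
        f"Host: {host}",         # which website on this server is wanted
        "",                      # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)    # HTTP/1.1 separates lines with CR LF


print(build_get_request("www.example.com"))
```

Nothing is sent over the network here; the function only shows what the text inside the envelope would look like.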
19320 18:28:37,501 --> 18:28:42,331 This is because single servers can\n 19321 18:28:42,331 --> 18:28:46,011 If you're using Squarespace or Wix or\n 19322 18:28:46,012 --> 18:28:49,282 nowadays, you don't get your own\npersonal server, most likely. 19323 18:28:49,282 --> 18:28:52,652 You're on the same server as\n 19324 18:28:52,652 --> 18:28:56,241 But when your customers,\nyour users' browsers 19325 18:28:56,241 --> 18:29:00,471 include a little mention of your\n 19326 18:29:00,471 --> 18:29:02,841 name in the envelope,\nSquarespace and Wix just 19327 18:29:02,842 --> 18:29:06,442 know to send it to your web page or\n 19328 18:29:07,142 --> 18:29:09,032 Dot dot dot, there's\nsome other stuff there. 19329 18:29:09,032 --> 18:29:12,532 But that's really the essence\nof what's in these requests. 19330 18:29:12,532 --> 18:29:16,312 Hopefully, then, when your browser\n 19331 18:29:17,361 --> 18:29:21,951 Well, hopefully, a response\nthat looks like this, HTTP/1.1 19332 18:29:21,952 --> 18:29:25,822 so the same version, some\nstatus code, like a number 200 19333 18:29:25,822 --> 18:29:30,021 and then literally a short phrase like\n 19334 18:29:31,682 --> 18:29:35,062 Then it contains some other\n 19335 18:29:35,872 --> 18:29:37,789 And we'll see that this,\ntoo, is standardized. 19336 18:29:37,789 --> 18:29:41,932 Text/HTML means here comes some\n 19337 18:29:41,932 --> 18:29:48,142 It could instead be image/jpeg\nor Image/png, or video/mp4 19338 18:29:48,142 --> 18:29:51,982 there are these different content\n 19339 18:29:51,982 --> 18:29:54,711 that uniquely identify types\nof files, that come back 19340 18:29:54,711 --> 18:29:57,322 similar in spirit to file\nextensions, but a little more 19341 18:29:59,232 --> 18:30:00,982 Then there's some more\nstuff, dot dot dot. 19342 18:30:00,982 --> 18:30:05,842 But in general, what you see here, are\n 19343 18:30:05,842 --> 18:30:09,652 These keys and values are\notherwise known as HTTP headers. 
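Since the response is also just text, its first line and its key-value pairs can be separated with ordinary string operations. This is a minimal sketch under the assumption of a well-formed response; `parse_response_head` is a hypothetical helper and the sample text is made up to mirror the example in the lecture.

```python
def parse_response_head(text: str) -> tuple[str, int, str, dict]:
    """Split a response into version, status code, phrase, and headers."""
    head = text.split("\r\n\r\n", 1)[0]          # everything before the blank line
    status_line, *header_lines = head.split("\r\n")
    pieces = status_line.split(" ", 2)
    version, code = pieces[0], int(pieces[1])
    phrase = pieces[2] if len(pieces) > 2 else ""  # newer versions may omit the phrase
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()  # header names ignore case
    return version, code, phrase, headers


sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>...</html>"
print(parse_response_head(sample))
```

The dictionary that comes back is the programmatic form of the "keys and values" that the lecture calls HTTP headers.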
19344 18:30:09,652 --> 18:30:14,092 And your browser has been sending\n 19345 18:30:14,092 --> 18:30:16,422 And, indeed, we can see\nthis right now ourselves. 19346 18:30:16,422 --> 18:30:20,422 Let me go over, in just a\nsecond, to Chrome on my computer 19347 18:30:20,422 --> 18:30:23,552 though you can do this kind of\n 19348 18:30:23,551 --> 18:30:28,641 I'll go ahead and visit\nHTTP://Harvard.edu, Enter. 19349 18:30:28,642 --> 18:30:31,459 And, voila, I'm at Harvard's\nhome page for today. 19350 18:30:32,542 --> 18:30:34,982 But this is what it\nlooks like right now. 19351 18:30:34,982 --> 18:30:38,002 Well, I typed in the URL, but\nnotice it changed a little bit. 19352 18:30:38,001 --> 18:30:41,011 It actually sent me to\nHTTPS and added the www 19353 18:30:41,012 --> 18:30:42,682 even though I didn't type that. 19354 18:30:42,682 --> 18:30:46,702 But it turns out we can poke around\n 19355 18:30:47,812 --> 18:30:50,542 I'm going to start to use incognito\n 19356 18:30:50,542 --> 18:30:52,500 care that people know\nI'm visiting Harvard.edu 19357 18:30:52,500 --> 18:30:56,471 but because it throws away\nany history that I just did. 19358 18:30:56,471 --> 18:30:58,971 So that every request is going\nto look like a brand new one 19359 18:30:58,971 --> 18:31:01,429 and that's just useful\ndiagnostically, because we're always 19360 18:31:01,429 --> 18:31:02,781 going to see fresh information. 19361 18:31:02,782 --> 18:31:06,082 My browser is not going to remember\n 19362 18:31:06,081 --> 18:31:09,441 But I'm going to go\nup to View, developer 19363 18:31:09,441 --> 18:31:13,042 developer tools, which is something\n 19364 18:31:13,042 --> 18:31:15,652 And there's something\nanalogous for Firefox and Edge 19365 18:31:15,652 --> 18:31:17,631 and Safari and other browsers. 19366 18:31:17,631 --> 18:31:20,301 Developer tools is going to\nopen up these tabs down here. 
19367 18:31:20,301 --> 18:31:23,509 I don't really care what's new, so I'm\n 19368 18:31:23,509 --> 18:31:26,182 And I'm going to hover over the\nNetwork tab for just a moment. 19369 18:31:26,182 --> 18:31:31,252 And now I'm going to go and\nsay HTTP://Harvard.edu, so 19370 18:31:32,422 --> 18:31:36,022 I'm going to hit Enter,\nand a whole bunch of stuff 19371 18:31:36,021 --> 18:31:37,342 just flew across the screen. 19372 18:31:38,601 --> 18:31:43,702 And if I zoom in down here, my God,\n 19373 18:31:43,702 --> 18:31:48,082 is downloading, what 17, 18,\n19 megabytes, 20 megabytes 19374 18:31:48,081 --> 18:31:53,182 millions of bytes of information,\nover 111 HTTP requests. 19375 18:31:53,182 --> 18:31:56,062 In other words, a bit of a\nsimplification, but my browser 19376 18:31:56,062 --> 18:31:59,692 unbeknownst to me, sent one\nenvelope initially with the request. 19377 18:31:59,691 --> 18:32:01,611 Then the server said,\nOK, by the way, there's 19378 18:32:01,611 --> 18:32:05,241 110 other things you need, 112\nother things you need to get. 19379 18:32:05,241 --> 18:32:09,441 So my computer went back and forth,\n 19380 18:32:10,191 --> 18:32:13,581 Well, inside of Harvard's web\npage is a whole bunch of images 19381 18:32:13,581 --> 18:32:16,341 and maybe sound files and\nvideos and other stuff 19382 18:32:16,342 --> 18:32:18,741 that all need to be\ndownloaded and to compose 19383 18:32:18,741 --> 18:32:20,161 what is ultimately the web page. 19384 18:32:20,161 --> 18:32:22,881 But I don't care about like\n100 plus of these things. 19385 18:32:22,881 --> 18:32:25,161 Let's focus on the very first one first. 19386 18:32:25,161 --> 18:32:27,711 The very first request\nI sent was up here. 19387 18:32:27,711 --> 18:32:30,771 And I'm going to click on this\nrow, under the Network tab. 19388 18:32:30,771 --> 18:32:33,501 And then I'm going to see a\nbit of diagnostic information. 
19389 18:32:33,501 --> 18:32:36,741 To an average person using the\n 19390 18:32:36,741 --> 18:32:39,292 just as you probably didn't\ncare about it until right now. 19391 18:32:41,001 --> 18:32:44,991 But if I scroll down to\nrequest headers, you will see 19392 18:32:44,991 --> 18:32:48,952 if I click View source, literally\n 19393 18:32:48,952 --> 18:32:51,282 my Mac just sent to Harvard.edu. 19394 18:32:51,282 --> 18:32:56,482 Two of the lines are familiar,\nGET / HTTP/1.1, Host: harvard.edu 19395 18:32:56,482 --> 18:32:59,942 and then other stuff that, for now,\n 19396 18:32:59,941 --> 18:33:03,381 But let's look at the response\nthat came back from the server. 19397 18:33:03,381 --> 18:33:08,421 I'm going to scroll up now and\n 19398 18:33:11,211 --> 18:33:14,032 There's no 200, there's no word OK. 19399 18:33:14,032 --> 18:33:18,112 Curiously, harvard.edu\nhas moved permanently. 19400 18:33:18,953 --> 18:33:20,661 Well, there's a whole\nbunch of stuff here 19401 18:33:20,661 --> 18:33:22,119 that's not that interesting for us. 19402 18:33:22,119 --> 18:33:24,781 But this line, Location, is interesting. 19403 18:33:24,782 --> 18:33:28,552 This is an HTTP header, a\nstandardized key-value pair 19404 18:33:28,551 --> 18:33:32,301 that's part of the HTTP\nprotocol, that is, conventions. 19405 18:33:32,301 --> 18:33:34,881 And if I highlight just\nthis one, it's telling me 19406 18:33:34,881 --> 18:33:38,421 mm-mmm, Harvard is not\nat HTTP://Harvard.edu 19407 18:33:38,422 --> 18:33:44,242 Harvard's website is now, and perhaps\n 19408 18:33:47,631 --> 18:33:50,641 Probably someone at Harvard wants\n 19409 18:33:50,642 --> 18:33:53,092 So they redirected you\nfrom HTTP to HTTPS. 19410 18:33:53,092 --> 18:33:57,392 Maybe the marketing people want you to\n 19411 18:33:57,892 --> 18:34:00,350 Just to standardize things,\nbut there are technical reasons 19412 18:34:00,350 --> 18:34:03,442 to use a hostname, and not\njust the raw domain name.
19413 18:34:03,441 --> 18:34:06,331 And all this other stuff is sort\n 19414 18:34:06,331 --> 18:34:11,361 now, because a browser that\nreceives a 301 response knows 19415 18:34:11,361 --> 18:34:16,461 by design, by the definition of HTTP,\n 19416 18:34:16,461 --> 18:34:20,065 And that's why, in my browser, all of\n 19417 18:34:20,065 --> 18:34:22,732 because I didn't really know or\ncare about all of those headers. 19418 18:34:22,732 --> 18:34:26,482 But that's why and how I\nended up at this URL here. 19419 18:34:26,482 --> 18:34:29,842 My browser was told to go\nelsewhere via that new location. 19420 18:34:29,842 --> 18:34:32,392 And the browser just\nfollowed those breadcrumbs 19421 18:34:32,392 --> 18:34:35,122 if you will, at which point it\n 19422 18:34:35,122 --> 18:34:39,562 and files, and so forth, that\ncompose this particular page. 19423 18:34:40,638 --> 18:34:42,471 And let me actually go\ninto VS Code, if only 19424 18:34:42,471 --> 18:34:45,596 because it's a little more pleasant to\n 19425 18:34:45,596 --> 18:34:49,201 without actually using\na full-fledged browser. 19426 18:34:49,202 --> 18:34:51,381 So now let's just use\nan equivalent program. 19427 18:34:51,381 --> 18:34:54,351 It's called Curl, for\nconnecting to a URL, that's 19428 18:34:54,351 --> 18:34:57,262 going to allow me to play with\n 19429 18:34:57,262 --> 18:34:59,872 without bothering to download\nall the images and text 19430 18:34:59,872 --> 18:35:01,312 and so forth from the website. 19431 18:35:01,312 --> 18:35:03,902 It's going to allow me to\ndo something like this. 19432 18:35:03,902 --> 18:35:10,941 Let me go ahead and run, for instance,\n 19433 18:35:10,941 --> 18:35:14,572 line arguments that says\nsimulate a GET request textually 19434 18:35:15,831 --> 18:35:20,661 And let's go to\nHTTP://Harvard.edu Enter. 19435 18:35:20,661 --> 18:35:23,834 Now, by way of how Curl, works,\nI'm just seeing the headers. 
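The "following breadcrumbs" behavior described above can be sketched as a small loop. This is an illustration under stated assumptions, not how any particular browser is implemented: `follow_redirects` is a hypothetical name, the `fetch` callable is injected so the logic runs without a network, `FAKE_WEB` is a stand-in for real servers, and relative Location values are not handled.

```python
REDIRECT_CODES = {301, 302, 307, 308}


def follow_redirects(fetch, url, max_hops=5):
    """Follow Location headers until a non-redirect response arrives.

    `fetch` is any callable that takes a URL and returns
    (status_code, headers_dict). Returns the final status code and
    the list of URLs visited along the way.
    """
    visited = [url]
    for _ in range(max_hops):
        status, headers = fetch(url)
        if status not in REDIRECT_CODES:
            return status, visited
        # The server says the resource lives elsewhere; go there next.
        url = headers["Location"]
        visited.append(url)
    raise RuntimeError("too many redirects")


# Hypothetical stand-in for the web: the first URL redirects, the second answers.
FAKE_WEB = {
    "http://example.com/": (301, {"Location": "https://www.example.com/"}),
    "https://www.example.com/": (200, {}),
}

status, chain = follow_redirects(FAKE_WEB.__getitem__, "http://example.com/")
print(status, chain)
```

The `max_hops` limit is a design choice to keep two URLs that redirect to each other from looping forever.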
19436 18:35:23,834 --> 18:35:25,792 It didn't bother downloading\nthe whole website. 19437 18:35:25,792 --> 18:35:28,792 And you see exactly the same\nthing, 301 moved permanently. 19438 18:35:28,792 --> 18:35:31,562 Location is, indeed, this one here. 19439 18:35:31,562 --> 18:35:33,202 So that's kind of interesting. 19440 18:35:33,202 --> 18:35:34,881 But let's follow it manually now. 19441 18:35:34,881 --> 18:35:37,682 Let's now do what it's telling me to do. 19442 18:35:37,682 --> 18:35:42,381 Let's go to the location, with\nHTTPS and the www and hit Enter. 19443 18:35:42,381 --> 18:35:46,402 And now, what's a good\nsign with this output? 19444 18:35:48,831 --> 18:35:51,741 SPEAKER 1: 200 OK, that\nmeans I'm seeing, presumably 19445 18:35:51,741 --> 18:35:55,221 if I were using a real browser,\n 19446 18:35:55,221 --> 18:35:58,911 Looks like Harvard's version of HTTP\n 19447 18:35:58,911 --> 18:36:01,051 It's using HTTP version\n2, which is fine. 19448 18:36:01,051 --> 18:36:04,611 But 200 is indeed indicative\nof things being OK. 19449 18:36:04,611 --> 18:36:07,822 Well, what if I try\nvisiting some bogus URL 19450 18:36:07,822 --> 18:36:14,331 like Harvard.edu, when this file does\n 19451 18:36:14,331 --> 18:36:17,331 probably doesn't exist, and hit Enter. 19452 18:36:17,331 --> 18:36:20,991 What do you see now, that's perhaps\nfamiliar, in the real world? 19453 18:36:24,620 --> 18:36:26,600 All of us have seen\nthis probably endlessly 19454 18:36:26,600 --> 18:36:29,301 from time to time, when you\nscrew up by mis-typing a URL 19455 18:36:29,301 --> 18:36:31,221 or someone deletes the\nweb page in question. 19456 18:36:31,221 --> 18:36:34,221 But all that is is a\nstatus code that a browser 19457 18:36:34,221 --> 18:36:37,521 is being sent from the server,\nthat's a little clue as to what 19458 18:36:37,521 --> 18:36:40,383 the actual problem is,\nunderneath the hood. 
19459 18:36:40,384 --> 18:36:42,092 So instead of getting\nback, for instance 19460 18:36:42,092 --> 18:36:45,202 something like OK, or moved permanently,\n 19461 18:36:47,902 --> 18:36:51,711 Well, it turns out there's\nother types of status codes 19462 18:36:51,711 --> 18:36:55,211 that you'll start to see over time,\n 19463 18:36:57,881 --> 18:37:01,421 302, 304, 307 are all similar in spirit. 19464 18:37:01,422 --> 18:37:04,782 They're related to redirecting the\n 19465 18:37:04,782 --> 18:37:08,802 401, 403, unauthorized or forbidden. 19466 18:37:08,801 --> 18:37:11,021 If you ever mess up\nyour password, or you 19467 18:37:11,021 --> 18:37:13,134 try visiting a URL you're\nnot supposed to look at 19468 18:37:13,134 --> 18:37:15,551 you might see one of these\ncodes, indicating that you just 19469 18:37:15,551 --> 18:37:17,441 don't have authorization for those. 19470 18:37:17,441 --> 18:37:21,671 404 not found, 418, I'm a\nteapot, was an April Fool's joke 19471 18:37:21,672 --> 18:37:24,072 by the tech community years ago. 19472 18:37:25,331 --> 18:37:27,281 And, unfortunately,\nall of you are probably 19473 18:37:27,282 --> 18:37:30,972 on a path now to creating\nHTTP 500 errors, once 19474 18:37:30,971 --> 18:37:33,114 next week, we start writing\ncode, because all of us 19475 18:37:34,032 --> 18:37:37,632 We're going to have typos, logical\n 19476 18:37:37,631 --> 18:37:42,042 just like segfaults were in the world of\n 19477 18:37:42,042 --> 18:37:45,254 503 service unavailable, means\nmaybe the server is overloaded 19478 18:37:46,211 --> 18:37:47,721 And there's other codes there. 19479 18:37:47,721 --> 18:37:51,011 But those are perhaps some\nof the most common ones. 19480 18:37:51,012 --> 18:37:54,642 Has anyone, we can get away with\n 19481 18:37:54,642 --> 18:37:58,972 has anyone ever visited\nSafetySchool.org? 19482 18:38:01,542 --> 18:38:08,381 HTTP://SafetySchool.org,\ndare we do this, Enter. 
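The status codes just listed fall into families by their first digit. As a rough sketch, Python's standard `http.HTTPStatus` enum can supply the official phrase for a code. The `STATUS_CLASSES` table and the `describe` helper are names made up for this example.

```python
from http import HTTPStatus

# First digit of the code -> family of responses.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return 'code phrase (class)' for a standard status code."""
    phrase = HTTPStatus(code).phrase
    return f"{code} {phrase} ({STATUS_CLASSES[code // 100]})"


for code in (200, 301, 404, 500, 503):
    print(describe(code))
```

Codes in the 400s point at something the client did (a mistyped URL, a missing login), while codes in the 500s point at the server, which is why buggy server-side code tends to surface as a 500.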
19483 18:38:17,262 --> 18:38:20,051 --so this has been like a\njoke for like 10 or 20 years. 19484 18:38:20,051 --> 18:38:22,811 Someone out there has been\npaying for the domain name 19485 18:38:22,812 --> 18:38:26,202 safetyschool.org, just for\nthis two second demonstration. 19486 18:38:26,202 --> 18:38:28,542 But we can now infer, how did this work? 19487 18:38:28,542 --> 18:38:31,452 The person who bought that domain\nname and somehow configured 19488 18:38:31,452 --> 18:38:35,502 DNS to point to their web server,\n 19489 18:38:35,501 --> 18:38:37,871 what is their web server\npresumably spitting out 19490 18:38:37,872 --> 18:38:41,021 whenever a browser requests the page? 19491 18:38:48,221 --> 18:38:49,911 Let me increase my terminal window. 19492 18:38:49,911 --> 18:38:58,271 Let me do curl -I -X GET\nHTTP://safetyschool.org Enter 19493 18:38:58,271 --> 18:39:00,012 and that's all this website does. 19494 18:39:00,012 --> 18:39:02,322 There's not even an\nactual website there. 19495 18:39:02,322 --> 18:39:04,842 No HTML, no CSS, languages\nwe're about to see. 19496 18:39:04,842 --> 18:39:09,702 It literally just exists on the\n 19497 18:39:09,702 --> 18:39:13,362 In fairness, there are others. 19498 18:39:13,361 --> 18:39:15,611 Let me actually do another one here. 19499 18:39:15,611 --> 18:39:19,511 Instead of safetyschool.org,\nturns out someone 19500 18:39:19,512 --> 18:39:25,572 some years ago, bought\nHarvardSucks.org Enter. 19501 18:39:25,572 --> 18:39:30,072 And when we do this, you'll see that,\n 19502 18:39:36,762 --> 18:39:38,785 This demo actually\nworked for so many years.
19503 18:39:38,785 --> 18:39:41,202 But someone has stopped paying\nfor the Squarespace account 19504 18:39:46,471 --> 18:39:52,111 OK, so, fortunately, we\ndid save the YouTube video 19505 18:39:53,672 --> 18:39:56,255 And so, just to put this\ninto context, since it's 19506 18:39:56,255 --> 18:39:58,422 been quite a few years,\nHarvard and Yale, of course 19507 18:39:58,422 --> 18:40:00,272 have this long-standing rivalry. 19508 18:40:00,271 --> 18:40:02,891 There is this tradition\nof pranking each other. 19509 18:40:02,892 --> 18:40:06,822 And, honestly, hands down, one of the\n 19510 18:40:09,062 --> 18:40:10,741 It's about a three-minute retrospective. 19511 18:40:10,741 --> 18:40:13,074 It's one of the earliest\nvideos, I dare say, on YouTube 19512 18:40:13,074 --> 18:40:15,512 so the quality is\nrepresentative of that. 19513 18:40:15,512 --> 18:40:18,402 But let me go ahead and\nfull screen my page here. 19514 18:40:18,402 --> 18:40:22,892 And what used to live at\nHarvardSucks.org is this video here. 19515 18:40:22,892 --> 18:40:25,738 If we could dim the lights\nfor about three minutes. 19516 18:40:53,975 --> 18:40:55,892 - Actually we're going\nall the way to the top. 19517 18:41:01,142 --> 18:41:04,322 - We're here to trip up Harvard. 19518 18:41:06,750 --> 18:41:08,402 - Pass from the top one, pass it down. 19519 18:41:09,714 --> 18:41:12,822 - It's nice to say the ERA sucks. 19520 18:41:19,732 --> 18:41:22,402 It's going to have to happen. 19521 18:41:22,402 --> 18:41:25,032 - It's actually going to happen. 19522 18:41:25,032 --> 18:41:26,452 I can't [BEEP] believe this. 19523 18:41:26,452 --> 18:41:28,512 - What do you think of Yale? 19524 18:41:33,334 --> 18:41:34,542 - Because they don't have it. 19525 18:41:37,013 --> 18:41:40,341 - Probably that's going\nto be legible, very small. 19526 18:41:45,801 --> 18:41:47,512 - Says, are we in boats now? 19527 18:41:48,262 --> 18:41:49,387 - How many extra are there? 
19528 18:41:54,445 --> 18:41:55,903 - You guys are from Harvard, right? 19529 18:41:58,611 --> 18:42:00,039 - Just make sure everyone has one. 19530 18:42:35,631 --> 18:42:38,891 - What do you think of Yale, sir? 19531 18:42:38,892 --> 18:42:41,376 - Going to be, do one more time. 19532 18:43:08,812 --> 18:43:13,222 SPEAKER 1: All right, so thanks to\n 19533 18:43:13,221 --> 18:43:17,129 Let's go ahead here and consider, in\n 19534 18:43:17,130 --> 18:43:18,922 down inside of the\nenvelope, because we now 19535 18:43:18,922 --> 18:43:24,202 have the ability to get data from,\n 19536 18:43:25,342 --> 18:43:30,082 Let's consider for just a moment\n 19537 18:43:30,081 --> 18:43:33,652 that we now have this ability to\n 19538 18:43:33,652 --> 18:43:37,490 And we have the ability to\nspecify in those envelopes what 19539 18:43:37,490 --> 18:43:38,782 it is we want from the website. 19540 18:43:38,782 --> 18:43:40,192 We want to get the home page. 19541 18:43:40,191 --> 18:43:41,932 We want to get back the HTML. 19542 18:43:43,247 --> 18:43:46,372 In fact, we don't yet have the language\n 19543 18:43:46,372 --> 18:43:48,494 are written, namely HTML and CSS. 19544 18:43:48,494 --> 18:43:50,702 But let's go ahead and take\na five minute break here. 19545 18:43:50,702 --> 18:43:54,182 And when we come back, we'll\nlearn those two languages. 19546 18:43:56,072 --> 18:43:58,372 So we've got three\nlanguages to look at today. 19547 18:43:58,372 --> 18:44:00,832 But two of them are not\nactually programming languages. 19548 18:44:00,831 --> 18:44:05,152 What makes something a programming\n 19549 18:44:05,152 --> 18:44:08,472 is that there are these constructs via\n 19550 18:44:08,471 --> 18:44:10,971 You might have variables, you\nmight have looping constructs. 19551 18:44:10,971 --> 18:44:13,341 You have the ability,\nultimately, to express logic. 
19552 18:44:13,342 --> 18:44:16,912 HTML and CSS aren't so much about\n 19553 18:44:16,911 --> 18:44:18,441 and the aesthetics of a page. 19554 18:44:18,441 --> 18:44:21,262 And so we're going to create\nthe skeleton of a web page using 19555 18:44:21,262 --> 18:44:23,190 this pair of languages, HTML and CSS. 19556 18:44:23,190 --> 18:44:24,982 And then toward the\nend of today, we'll 19557 18:44:24,982 --> 18:44:26,872 introduce an actual\nprogramming language 19558 18:44:26,872 --> 18:44:30,232 that actually is pretty similar\nin spirit and syntactically 19559 18:44:30,232 --> 18:44:34,101 to both C and Python, but that's going\n 19560 18:44:34,101 --> 18:44:38,251 just static, things that you look at,\n 19561 18:44:38,251 --> 18:44:42,771 And then next week again, in week 9,\n 19562 18:44:42,771 --> 18:44:46,251 tie all of this together, so that you\n 19563 18:44:46,251 --> 18:44:49,221 talking to a back-end\nserver, and creating 19564 18:44:49,221 --> 18:44:53,151 the experience that you and I now take\n 19565 18:44:53,812 --> 18:44:55,187 Well, let's go ahead and do this. 19566 18:44:55,187 --> 18:44:58,402 Let's quickly whip up something\nin this language called HTML. 19567 18:44:59,601 --> 18:45:04,101 I'm going to go ahead and create a\n 19568 18:45:04,101 --> 18:45:07,491 The convention is typically to\nend your file names in dot html. 19569 18:45:07,491 --> 18:45:09,783 And I'm going to go ahead\nand bang this out real quick. 19570 18:45:09,783 --> 18:45:12,842 But then we'll more slowly step\n 19571 18:45:12,842 --> 18:45:17,482 So I'm going to say doctype\nhtml open bracket html 19572 18:45:17,482 --> 18:45:22,072 and then notice I'm going to do open\n 19573 18:45:22,072 --> 18:45:25,672 And I'm leveraging a feature of VS\n 19574 18:45:25,672 --> 18:45:27,601 generally, to do a bit of autocomplete.
19575 18:45:27,601 --> 18:45:30,831 So you'll see that there's this symmetry\n 19576 18:45:30,831 --> 18:45:32,631 but I'm not typing all of these things. 19577 18:45:32,631 --> 18:45:37,611 VS Code is automatically generating the\n 19578 18:45:37,611 --> 18:45:41,511 Let me go ahead and\nsay, Open the head tag. 19579 18:45:42,831 --> 18:45:44,932 I'll say something cute\nlike, Hello, title. 19580 18:45:44,932 --> 18:45:47,661 And then down here, I'm going to\n 19581 18:45:47,661 --> 18:45:49,461 and say something like Hello, body. 19582 18:45:49,461 --> 18:45:53,241 And let me specify at the very top,\n 19583 18:45:54,592 --> 18:46:00,562 So at this moment, I have a file in my\n 19584 18:46:00,562 --> 18:46:03,484 VS Code as we're using it,\nof course, is cloud-based. 19585 18:46:03,483 --> 18:46:05,691 We're using it in a browser,\neven though you can also 19586 18:46:05,691 --> 18:46:07,792 download it and run it on a Mac and PC. 19587 18:46:07,792 --> 18:46:10,491 So we are in this weird\nsituation where I'm 19588 18:46:10,491 --> 18:46:12,771 using the cloud to\ncreate a web page, and I 19589 18:46:12,771 --> 18:46:17,781 want that web page to also live in\n 19590 18:46:17,782 --> 18:46:21,082 But the thing about VS\nCode, or really any website 19591 18:46:21,081 --> 18:46:24,741 that you might use in a browser, by\n 19592 18:46:24,741 --> 18:46:29,361 TCP port number 80 or\nTCP port number 443 19593 18:46:29,361 --> 18:46:32,061 which is HTTP and HTTPS respectively. 19594 18:46:32,062 --> 18:46:34,912 But here I am, sort of\na programmer myself 19595 18:46:34,911 --> 18:46:39,241 trying to create my own\nwebsite on an existing website. 19596 18:46:39,241 --> 18:46:40,732 So it's a bit of a weird situation. 
19597 18:46:40,732 --> 18:46:43,222 But that's OK, because\nwhat's nice about TCP 19598 18:46:43,221 --> 18:46:47,572 is that you and I can just pick port\n 19599 18:46:49,312 --> 18:46:51,802 That is, we can control\nthe environment entirely 19600 18:46:51,801 --> 18:46:57,711 by just running our own web server\n 19601 18:46:58,732 --> 18:47:01,445 This is a command that we\npreinstalled in VS Code here. 19602 18:47:01,445 --> 18:47:03,112 And you'll notice a pop-up just came up. 19603 18:47:03,111 --> 18:47:05,841 Your application running\non port 8080 is available. 19604 18:47:05,842 --> 18:47:09,172 That's a commonly used TCP port\nnumber, when 80 is already used 19605 18:47:09,172 --> 18:47:11,932 and 443 is already used,\nyou can run your own server 19606 18:47:11,932 --> 18:47:14,521 on your own port, 8080 in this case. 19607 18:47:14,521 --> 18:47:18,592 I've opened that tab in advance, and\n 19608 18:47:18,592 --> 18:47:22,252 here I see a so-called directory\n 19609 18:47:22,251 --> 18:47:24,322 So I don't see any of my other files. 19610 18:47:24,322 --> 18:47:27,351 I don't see anything\nbelonging to VS Code itself. 19611 18:47:27,351 --> 18:47:30,562 I only see the file that I've created\n 19612 18:47:31,851 --> 18:47:36,232 And so if I click on this file\nnow, I should see Hello, body. 19613 18:47:37,312 --> 18:47:39,262 But that's because the\ntitle of a web page 19614 18:47:39,262 --> 18:47:41,699 nowadays is typically\nembedded in the tab. 19615 18:47:41,698 --> 18:47:44,031 And if I'm full screen in my\nbrowser, there are no tabs. 19616 18:47:44,032 --> 18:47:45,652 So let me minimize the window a bit. 19617 18:47:45,652 --> 18:47:50,032 And now you can see just in this\n 19618 18:47:50,032 --> 18:47:52,382 here, that Hello, body, is\nin the top left hand corner. 19619 18:47:52,381 --> 18:47:54,801 And if I zoom in, there's Hello, title. 
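The port numbers mentioned above (80 for HTTP, 443 for HTTPS, and 8080 for a server you run yourself) can be read straight out of a URL. The following is a small sketch using the standard `urllib.parse.urlsplit`. `effective_port` is a hypothetical helper name, and `localhost` is a placeholder host.

```python
from urllib.parse import urlsplit

# Ports a client assumes when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the TCP port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # an explicit port such as :8080 wins
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://localhost:8080/hello.html"))
```

As a stand-in for the server command used in the lecture, Python's standard library can also serve the current directory on a chosen port with `python -m http.server 8080`. That is an alternative tool, not the one shown on screen.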
19620 18:47:56,271 --> 18:48:01,491 I have gone ahead and created\nmy own web page in HTML 19621 18:48:01,491 --> 18:48:04,851 in a file called Hello.html. 19622 18:48:04,851 --> 18:48:09,251 And then I have opened up\na web server of my own 19623 18:48:09,251 --> 18:48:11,591 configured it to listen\non TCP port 8080 19624 18:48:11,592 --> 18:48:14,982 which just says to the internet, hey,\n 19625 18:48:14,982 --> 18:48:18,612 not on the standard port number,\n80 or 443, listen on 8080. 19626 18:48:18,611 --> 18:48:22,481 And this means I can develop a website\n 19627 18:48:22,482 --> 18:48:24,851 here, which is\nincreasingly common today. 19628 18:48:24,851 --> 18:48:28,691 All right, so now let's consider\n 19629 18:48:28,691 --> 18:48:32,891 HTML is characterized really by just\n 19630 18:48:33,732 --> 18:48:36,972 Most of what I just typed were tags,\n 19631 18:48:37,732 --> 18:48:41,872 Here's the same source code that I\n 19632 18:48:41,872 --> 18:48:43,122 Let's consider what this is. 19633 18:48:43,122 --> 18:48:46,631 The very first line of\ncode here, doctype html 19634 18:48:47,892 --> 18:48:51,522 It's the only one that starts with\n 19635 18:48:52,782 --> 18:48:55,392 There's no more exclamation\npoints thereafter, for now. 19636 18:48:55,392 --> 18:48:58,842 This is the document type declaration,\n 19637 18:48:58,842 --> 18:49:00,432 it's just got to be there nowadays. 19638 18:49:00,432 --> 18:49:02,974 It's like a little breadcrumb\nat the beginning of a file that 19639 18:49:02,974 --> 18:49:08,021 says to the browser, you are about to\n 19640 18:49:08,021 --> 18:49:11,000 That line of code has changed\nover time, over the years. 19641 18:49:11,000 --> 18:49:13,542 The most recent version of it\nis nice and succinct like this 19642 18:49:13,542 --> 18:49:16,572 and it's just a clue to the\nbrowser as to what version of HTML 19643 18:49:16,572 --> 18:49:18,881 is being used by you, the programmer. 
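For reference, here is a sketch of what the page being discussed might look like, held in a Python string so it can be written to disk. The markup is reconstructed from the spoken description (doctype, an html tag with a lang attribute, a head with a title, and a body), so the exact capitalization and indentation are guesses. `write_page` is a hypothetical helper.

```python
from pathlib import Path

# Reconstructed from the spoken description; spacing and casing are guesses.
HELLO_HTML = """<!DOCTYPE html>

<html lang="en">
    <head>
        <title>Hello, title</title>
    </head>
    <body>
        Hello, body
    </body>
</html>
"""


def write_page(path, html=HELLO_HTML):
    """Write the page to disk and return the path that was written."""
    target = Path(path)
    target.write_text(html, encoding="utf-8")
    return target
```

Calling `write_page("hello.html")` would create the file in the current directory, where a local web server could then serve it.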
19644 18:49:18,881 --> 18:49:20,781 All right, what comes after that? 19645 18:49:20,782 --> 18:49:23,592 Well, after that, and I've\nhighlighted two things in yellow 19646 18:49:23,592 --> 18:49:26,442 this is what we're going to\nstart calling an open tag 19647 18:49:26,441 --> 18:49:30,911 or a start tag, open bracket HTML\nthen something, close bracket 19648 18:49:30,911 --> 18:49:32,991 is the so-called start or open tag. 19649 18:49:32,991 --> 18:49:35,741 Then the corresponding close\nor end tag is down here. 19650 18:49:37,092 --> 18:49:40,252 You use the same tag name, you\nuse the same angle brackets. 19651 18:49:40,251 --> 18:49:43,121 But you do add a slash, and\nyou don't repeat yourself 19652 18:49:43,122 --> 18:49:46,032 with any of the things\ncalled attributes 19653 18:49:46,032 --> 18:49:47,772 because, what is this thing here? 19654 18:49:47,771 --> 18:49:50,891 Lang equals quote unquote\n"en," means the language 19655 18:49:50,892 --> 18:49:53,632 of my page is written\nin the English language. 19656 18:49:53,631 --> 18:49:56,771 The humans have standardized\ntwo- and three-letter codes 19657 18:49:56,771 --> 18:49:59,451 for every human language, right now. 19658 18:49:59,452 --> 18:50:03,131 And so this is just a clue to the\n 19659 18:50:03,131 --> 18:50:06,402 and accessibility purposes,\nwhat language the web page 19660 18:50:07,331 --> 18:50:10,661 Not the tags, but the words, like\nHello, title and Hello, body 19661 18:50:10,661 --> 18:50:13,182 which while minimalist,\nare indeed in English. 19662 18:50:13,182 --> 18:50:16,142 So when you close a tag, you close\nthe name of it with the slash 19663 18:50:17,111 --> 18:50:19,319 You don't repeat the attribute. 19664 18:50:19,320 --> 18:50:21,862 That would just be annoying to\nhave to type everything again. 19665 18:50:21,861 --> 18:50:23,028 But notice the pattern here. 19666 18:50:24,262 --> 18:50:27,551 But this is another example of\nkey value pairs in computing.
19667 18:50:27,551 --> 18:50:31,631 The key is Lang, the\nvalue is E-N for English. 19668 18:50:31,631 --> 18:50:34,182 The attribute is called\nLang, the value is 19669 18:50:34,182 --> 18:50:38,142 called, it is E-N. So again,\nit's just key value pairs 19670 18:50:38,142 --> 18:50:39,552 in just yet another context. 19671 18:50:39,551 --> 18:50:41,682 Probably the browser's using a\nhash table underneath the hood 19672 18:50:41,682 --> 18:50:44,807 to keep track of this stuff, like a\n 19673 18:50:44,807 --> 18:50:48,292 Again, humans keep using the same\n 19674 18:50:49,732 --> 18:50:52,241 The nesting is important\nvisually, not to the computer 19675 18:50:52,241 --> 18:50:54,491 but to us, the humans,\nbecause it implies 19676 18:50:54,491 --> 18:50:55,971 that there's some hierarchy here. 19677 18:50:55,971 --> 18:50:59,781 And, indeed, what is inside\nof the HTML tag here? 19678 18:50:59,782 --> 18:51:02,892 Well, we have what\nwe'll call the head tag. 19679 18:51:02,892 --> 18:51:06,252 The head tag says, hey, browser,\n 19680 18:51:06,251 --> 18:51:08,519 And then the body tag\nsays, hey, browser 19681 18:51:08,519 --> 18:51:09,851 here comes the body of the page. 19682 18:51:09,851 --> 18:51:13,902 The body is like 99% of the user's\n 19683 18:51:13,902 --> 18:51:17,532 The head is really just the address\n 19684 18:51:17,532 --> 18:51:20,292 like the title that we saw a moment ago. 19685 18:51:20,292 --> 18:51:24,732 Just to introduce the vernacular,\n 19686 18:51:24,732 --> 18:51:29,202 as an element, has two children,\n 19687 18:51:29,202 --> 18:51:31,961 which is to say that head\nand body are now siblings. 19688 18:51:31,961 --> 18:51:35,111 So you can use the same kind of\n 19689 18:51:35,111 --> 18:51:37,421 when talking about trees, weeks ago. 
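The idea that attributes are key-value pairs can be shown with Python's standard `html.parser` module, which hands each start tag's attributes to your code as (key, value) pairs. This sketch stores them in a Python dict, which is itself a hash table. Whether a real browser does the same internally is the lecture's guess, not something this example establishes. `AttributeCollector` and `collect_attributes` are hypothetical names.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Record each start tag's attributes as a dict keyed by tag name."""

    def __init__(self):
        super().__init__()
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (key, value) pairs, e.g. [("lang", "en")].
        self.attributes[tag] = dict(attrs)


def collect_attributes(html):
    """Parse the markup and return {tag_name: {attribute: value}}."""
    parser = AttributeCollector()
    parser.feed(html)
    parser.close()
    return parser.attributes


page = '<html lang="en"><head><title>Hello, title</title></head><body>Hello, body</body></html>'
print(collect_attributes(page)["html"])
```

Note that the end tags contribute nothing here, which matches the point that attributes are written only on the start tag.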
19690 18:51:37,422 --> 18:51:41,802 If we look at the head tag, how\n 19691 18:51:41,801 --> 18:51:44,591 I'm seeing one, and,\nindeed, at least if we 19692 18:51:44,592 --> 18:51:48,851 ignore all the white space, the\n 19693 18:51:48,851 --> 18:51:51,072 there's just one child, a title element. 19694 18:51:51,072 --> 18:51:55,032 And an element is the terminology that\n 19695 18:51:56,422 --> 18:51:58,242 So this is the title element. 19696 18:51:58,241 --> 18:52:01,812 And the title element has one\nchild, which is just pure text 19697 18:52:01,812 --> 18:52:03,792 otherwise known as a text node. 19698 18:52:03,792 --> 18:52:06,911 Recall, node, from our discussions\nof data structures weeks ago. 19699 18:52:06,911 --> 18:52:10,961 If we jump then to the body, which\n 19700 18:52:10,961 --> 18:52:14,922 it too has one child, which is just\n 19701 18:52:14,922 --> 18:52:17,292 that says, quote unquote "Hello, body. 19702 18:52:17,292 --> 18:52:21,042 What's nice about this indentation,\n 19703 18:52:21,042 --> 18:52:25,312 is not going to care, is that it\nimplies this kind of structure. 19704 18:52:25,312 --> 18:52:28,902 And this is where we connect,\nlike weeks 5 and now weeks 8, here 19705 18:52:28,902 --> 18:52:33,282 is the tree structure we began to\n 19706 18:52:33,282 --> 18:52:35,585 It's not a binary tree,\neven though this one happens 19707 18:52:35,585 --> 18:52:37,002 to have no more than two children. 19708 18:52:37,001 --> 18:52:40,961 It's an arbitrary tree that can\n 19709 18:52:40,961 --> 18:52:43,842 But if we have a special node\nhere that refers to the document 19710 18:52:43,842 --> 18:52:47,082 the root node, so to speak, is\nHTML, drawn with a rectangle 19711 18:52:47,081 --> 18:52:48,671 here, just for discussion's sake. 19712 18:52:48,672 --> 18:52:51,522 It has two children, head\nand body, also rectangles. 
19713 18:52:51,521 --> 18:52:54,851 Head has a title child,\nand then it and body 19714 18:52:54,851 --> 18:52:57,861 have text nodes, which I've\ndrawn with ovals instead. 19715 18:52:57,861 --> 18:53:01,301 Which is only to say that when your\n 19716 18:53:01,301 --> 18:53:03,762 downloads a web page,\nopens up that envelope 19717 18:53:03,762 --> 18:53:06,771 and sees the contents that\nhave come back from the server 19718 18:53:06,771 --> 18:53:10,221 it essentially reads the\ncode that someone wrote 19719 18:53:10,221 --> 18:53:12,631 the HTML code, top to\nbottom, left to right 19720 18:53:12,631 --> 18:53:16,402 and creates in the browser's\nmemory, in your Mac or your PC 19721 18:53:16,402 --> 18:53:20,241 or your phone's memory or RAM,\nthis kind of data structure. 19722 18:53:20,241 --> 18:53:22,131 That's what's going on\nunderneath the hood. 19723 18:53:22,131 --> 18:53:24,592 And that's why aesthetically,\nit's just nice, as a human 19724 18:53:24,592 --> 18:53:27,802 to indent things stylistically,\nbecause it's very clear then 19725 18:53:27,801 --> 18:53:32,701 to you, and to other programmers,\n 19726 18:53:32,702 --> 18:53:36,112 So that's it for like\nthe fundamentals of HTML. 19727 18:53:36,111 --> 18:53:38,451 We'll see a bunch of tags\nand a bunch of examples now. 19728 18:53:38,452 --> 18:53:40,682 But HTML is just tags and attributes. 19729 18:53:40,682 --> 18:53:43,432 And it's the kind of thing that\n 19730 18:53:43,432 --> 18:53:45,142 Eventually, many of them get ingrained. 19731 18:53:45,142 --> 18:53:47,939 I constantly check the reference\nguides or stack overflow 19732 18:53:47,938 --> 18:53:50,271 if I'm trying to figure out,\nhow do I lay something out. 19733 18:53:50,271 --> 18:53:52,063 It's really just these\nbuilding blocks that 19734 18:53:52,063 --> 18:53:55,131 allow you to assemble the\nstructure of a web page. 
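The tree described above can be built in a few lines with Python's standard `html.parser`. This is only a sketch of the idea and not a browser's actual algorithm: `TreeBuilder`, `build_tree`, and `outline` are hypothetical names, each node is a plain `(name, children)` tuple, and the code assumes every start tag has a matching end tag.

```python
from html.parser import HTMLParser


class TreeBuilder(HTMLParser):
    """Build a nested (name, children) tree from well-formed HTML."""

    def __init__(self):
        super().__init__()
        self.root = ("document", [])
        self._stack = [self.root]  # path from the root to the open element

    def handle_starttag(self, tag, attrs):
        node = (tag, [])
        self._stack[-1][1].append(node)  # become a child of the open element
        self._stack.append(node)

    def handle_endtag(self, tag):
        self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only text between tags
            self._stack[-1][1].append(("#text", text))


def build_tree(html):
    """Parse the markup and return the root ("document", children) node."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    name, payload = node
    if name == "#text":
        return ["    " * depth + repr(payload)]
    lines = ["    " * depth + name]
    for child in payload:
        lines.extend(outline(child, depth + 1))
    return lines
```

Printing `"\n".join(outline(build_tree(page)))` for the hello page would show html under the document, head and body as siblings under html, and the two text nodes as leaves, mirroring the indentation of the source file.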
19735 18:53:55,131 --> 18:53:58,551 This one is being super simple,\n 19736 18:53:58,551 --> 18:54:01,101 Any questions on this\nframework, before we 19737 18:54:01,101 --> 18:54:05,489 start to add more tags, more\nvocabulary, if you will? 19738 18:54:06,322 --> 18:54:09,301 AUDIENCE: What would happen\nif we put the title tag? 19739 18:54:09,301 --> 18:54:13,091 SPEAKER 1: If we put the hello tag\n 19740 18:54:13,751 --> 18:54:20,251 So let me actually go to this,\nand say open bracket title 19741 18:54:20,251 --> 18:54:23,461 whoops, sometimes you don't want\n 19742 18:54:24,601 --> 18:54:27,031 I've gone ahead and changed the file. 19743 18:54:27,032 --> 18:54:30,362 Let me go and open up, give me a\n 19744 18:54:30,361 --> 18:54:34,411 and go back to the URL that has my page. 19745 18:54:41,161 --> 18:54:45,891 And let me go ahead now\nand click on Hello.html. 19746 18:54:45,892 --> 18:54:49,352 And in this case, it looks like\nwe don't actually see anything. 19747 18:54:49,351 --> 18:54:50,631 So the browser is hiding it. 19748 18:54:50,631 --> 18:54:54,051 Technically speaking, browsers\ntend to be pretty generous. 19749 18:54:54,051 --> 18:54:56,331 And half the time, when\nyou make mistakes in HTML 19750 18:54:56,331 --> 18:54:58,101 it will display, it might display-- 19751 18:54:58,101 --> 18:54:59,437 not display as you intend it. 19752 18:54:59,437 --> 18:55:03,022 It might not display the same on\n 19753 18:55:03,021 --> 18:55:05,211 There is a tool, though,\nthat we'll see, that 19754 18:55:05,211 --> 18:55:07,042 can help answer this question for you. 
19755 18:55:07,042 --> 18:55:11,032 For instance, if I go\nto Validator.w3.org 19756 18:55:11,032 --> 18:55:13,072 W3 is the World Wide\nWeb Consortium, a group 19757 18:55:13,072 --> 18:55:15,232 of people that standardize\nthis kind of stuff 19758 18:55:15,232 --> 18:55:17,631 I can click on Validate\nby direct input, and just 19759 18:55:17,631 --> 18:55:21,411 copy paste my sample HTML into\nthis box, and click Check. 19760 18:55:21,411 --> 18:55:24,051 And I should see,\nhopefully, that indeed, it's 19761 18:55:24,051 --> 18:55:25,671 an error, what you proposed that I do. 19762 18:55:25,672 --> 18:55:27,744 The browser just did its\nbest to do something 19763 18:55:27,744 --> 18:55:30,952 which was to show me nothing at least,\n 19764 18:55:30,952 --> 18:55:34,491 But if I revert that change, and\nlet me undo what we just did 19765 18:55:34,491 --> 18:55:39,411 let me copy my original code back\n 19766 18:55:39,411 --> 18:55:41,831 now you can see, conversely,\nmy code is now correct. 19767 18:55:41,831 --> 18:55:43,581 And there's automated\ntools to check that. 19768 18:55:43,581 --> 18:55:45,873 But we'll encourage you, for\nproblem sets and projects 19769 18:55:45,873 --> 18:55:48,831 to use that particular manual tool. 19770 18:55:48,831 --> 18:55:51,621 All right, so let's go ahead\nand enhance this a little bit 19771 18:55:51,622 --> 18:55:53,572 by introducing a whole\nbunch of tags, just 19772 18:55:53,572 --> 18:55:55,952 to give you a sense of some\nof the building blocks here. 
19773 18:55:55,952 --> 18:56:01,292 So I'm going to go ahead and create\n 19774 18:56:01,292 --> 18:56:04,292 And I'm just going to do a bunch of\n 19775 18:56:04,292 --> 18:56:07,432 so I'm not constantly typing all\n 19776 18:56:07,432 --> 18:56:09,411 because I want everything\nto be the same here 19777 18:56:09,411 --> 18:56:12,441 except I'm going to change my title\n 19778 18:56:12,441 --> 18:56:16,028 And inside of the body, I need a\n 19779 18:56:16,028 --> 18:56:18,111 And I don't really want\nto come up with some text. 19780 18:56:18,111 --> 18:56:21,682 So let me go to some random website\n 19781 18:56:21,682 --> 18:56:25,051 which if you're involved in like\n 19782 18:56:25,051 --> 18:56:28,581 this is placeholder text, kind of looks\n 19783 18:56:28,581 --> 18:56:32,121 Here, though, I have a handy way of\n 19784 18:56:32,122 --> 18:56:33,801 in something that looks like Latin. 19785 18:56:33,801 --> 18:56:36,012 And I've put those,\nnotice, inside of the body. 19786 18:56:37,012 --> 18:56:40,112 Look how long the\nmade-up words here are. 19787 18:56:40,111 --> 18:56:46,251 So let me go now into\nmy browser tab here. 19788 18:56:46,251 --> 18:56:50,301 Let me reload this page, and you'll\n 19789 18:56:50,301 --> 18:56:53,001 Paragraphs.html, which is\nmy new one, and Hello.html. 19790 18:56:53,001 --> 18:56:58,351 Let me click on Paragraphs.html,\n 19791 18:57:00,172 --> 18:57:03,172 SPEAKER 1: Yeah, it's obviously one\n 19792 18:57:03,172 --> 18:57:07,762 So that's interesting, but it's just a\n 19793 18:57:07,762 --> 18:57:09,142 It will only do what you say. 19794 18:57:09,142 --> 18:57:12,412 And each of these tags tells the\n 19795 18:57:12,411 --> 18:57:15,801 and then maybe stop doing something,\n 19796 18:57:15,801 --> 18:57:17,676 Hey, browser, here comes\nthe head of my page. 19797 18:57:17,676 --> 18:57:20,211 Hey, browser, here comes the\ntitle of my page, Hello, title. 
19798 18:57:20,211 --> 18:57:22,042 Hey, browser, that's it for the title. 19799 18:57:22,042 --> 18:57:24,661 That's it for the head,\nhere comes the body tag. 19800 18:57:24,661 --> 18:57:27,411 So it's kind of having this\nconversation between the browser 19801 18:57:27,411 --> 18:57:30,471 between the HTML and the browser,\ndoing literally what it says. 19802 18:57:30,471 --> 18:57:32,721 So if you want a\nparagraph, you're probably 19803 18:57:32,721 --> 18:57:35,481 going to want to use\nthe P tag for paragraph. 19804 18:57:35,482 --> 18:57:38,642 And I'm going to go ahead\nand add this to my code. 19805 18:57:38,642 --> 18:57:41,392 I'm going to keep things neat,\n 19806 18:57:43,072 --> 18:57:47,032 Let me create another paragraph\ntag here, and close it 19807 18:57:47,032 --> 18:57:49,522 right after that one,\nindenting again, and I'm 19808 18:57:49,521 --> 18:57:51,232 keeping everything nice and orderly. 19809 18:57:53,402 --> 18:57:58,892 Let me indent that, and then let me\n 19810 18:57:58,892 --> 18:58:02,672 So again, a little tedious, but now I\n 19811 18:58:02,672 --> 18:58:04,101 hey, browser, start a paragraph. 19812 18:58:04,101 --> 18:58:05,721 Hey, browser, stop that paragraph. 19813 18:58:07,682 --> 18:58:09,961 Let me go back to the\nbrowser window here. 19814 18:58:09,961 --> 18:58:13,491 Let me hit Command R or\nControl R to reload the page. 19815 18:58:13,491 --> 18:58:16,732 And voila, now I have three\ncleaner paragraphs, all right? 19816 18:58:16,732 --> 18:58:18,562 So there's a P tag for paragraphs. 19817 18:58:18,562 --> 18:58:20,572 So now we have that\nparticular building block. 19818 18:58:20,572 --> 18:58:23,839 What if I want to add, for instance,\nsome headings to this page? 19819 18:58:23,839 --> 18:58:25,672 Well, that's something\nthat's possible, too. 19820 18:58:25,672 --> 18:58:29,152 Let me go ahead and create a\nnew file called Headings.html. 
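Since the browser only starts a new paragraph when told to, each block of text needs its own p element. As a rough sketch, the wrapping done by hand above could be automated as follows. `to_paragraphs` is a hypothetical helper, and it assumes paragraphs in the input are separated by blank lines.

```python
from html import escape


def to_paragraphs(text):
    """Wrap each blank-line-separated chunk of text in its own <p> element."""
    chunks = [chunk.strip() for chunk in text.split("\n\n")]
    return "\n".join(
        f"<p>\n    {escape(chunk)}\n</p>" for chunk in chunks if chunk
    )


print(to_paragraphs("first block\n\nsecond block\n\nthird block"))
```

The call to `escape` is a precaution so that characters such as `<` in the text are not mistaken for the start of a tag.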
19821 18:58:29,152 --> 18:58:31,911 Let me copy and paste\nthat same code as before. 19822 18:58:31,911 --> 18:58:36,021 But now, let's preface each\nparagraph with maybe H1. 19823 18:58:36,021 --> 18:58:38,482 And I'm going to just\nwrite the word one. 19824 18:58:38,482 --> 18:58:41,452 And here I'm going to say H2, two. 19825 18:58:41,452 --> 18:58:44,572 And down here I might say H3, three. 19826 18:58:44,572 --> 18:58:49,184 So this is another tag,\nanother three tags, H1, H2, H3. 19827 18:58:49,184 --> 18:58:51,351 As you might have inferred\nby the file name I chose 19828 18:58:51,351 --> 18:58:55,342 this just gives you headings, like in\n 19829 18:58:55,342 --> 18:58:57,412 or subsections, or in\nan academic paper, you 19830 18:58:57,411 --> 18:59:00,182 have different hierarchies to\nthe text that you're writing. 19831 18:59:00,182 --> 18:59:04,461 So now that I've added an H1 tag,\n 19832 18:59:04,461 --> 18:59:07,491 two, H3 tag and the word three,\nlet's go back to the browser 19833 18:59:07,491 --> 18:59:12,232 reload the page again,\nand voila, once the page 19834 18:59:12,232 --> 18:59:17,152 reloads, I'll do it with the\nmanual button, reload the page. 19835 18:59:20,331 --> 18:59:21,623 AUDIENCE: Not in headings file. 19836 18:59:21,623 --> 18:59:23,581 SPEAKER 1: Right, I'm\nnot in the headings file. 19837 18:59:27,601 --> 18:59:29,301 OK, now we see some evidence of this. 19838 18:59:29,301 --> 18:59:30,921 Again, it's nonsensical content. 19839 18:59:30,922 --> 18:59:33,982 But you can kind of see that\nH1 is apparently big and bold 19840 18:59:33,982 --> 18:59:36,922 H2 is slightly less big, but still bold. 19841 18:59:36,922 --> 18:59:38,752 H3 is the same but a little smaller. 19842 18:59:38,751 --> 18:59:40,366 And it goes all the way down to H6. 19843 18:59:40,366 --> 18:59:42,741 After that, you should probably\nreorganize your thoughts. 
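A minimal sketch of the headings file, assuming the same boilerplate as before and placeholder paragraph text; only the H1, H2, and H3 tags and the words one, two, and three come from the lecture.

```html
<!DOCTYPE html>

<!-- Sketch of headings.html; paragraph text is a placeholder -->
<html lang="en">
    <head>
        <title>headings</title>
    </head>
    <body>
        <!-- h1 is the largest heading; h2 through h6 get progressively smaller -->
        <h1>One</h1>
        <p>
            Lorem ipsum dolor sit amet.
        </p>

        <h2>Two</h2>
        <p>
            Consectetur adipiscing elit.
        </p>

        <h3>Three</h3>
        <p>
            Sed do eiusmod tempor incididunt.
        </p>
    </body>
</html>
```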
19844 18:59:42,741 --> 18:59:44,631 But there are six\ndifferent hierarchies here 19845 18:59:44,631 --> 18:59:48,831 as you might use for chapters, sections,\n 19846 18:59:48,831 --> 18:59:53,061 So those are headings, as an\nHTML tag, in our vocabulary. 19847 18:59:53,062 --> 18:59:59,902 What's a common thing, too, well, let\n 19848 18:59:59,902 --> 19:00:04,851 and get some boilerplate here,\ncreate a file called List.html. 19849 19:00:04,851 --> 19:00:07,641 Let's create a simple\nlist inside of my body 19850 19:00:07,642 --> 19:00:10,522 and I'll give this a title of List. 19851 19:00:10,521 --> 19:00:13,531 And let me fix the title of this\none to be Headings, as well. 19852 19:00:13,532 --> 19:00:19,172 So in List.html, suppose I want to have\n 19853 19:00:19,172 --> 19:00:21,172 they're like a computer\nscientist's go-to words 19854 19:00:21,172 --> 19:00:23,302 just like a mathematician might say x, y, z. 19855 19:00:23,301 --> 19:00:26,122 Foo, bar, baz is in List.html. 19856 19:00:26,122 --> 19:00:29,991 Let me go back to my\nbrowser, hit the Back button. 19857 19:00:29,991 --> 19:00:33,921 There's List.html, and, hopefully,\n 19858 19:00:33,922 --> 19:00:37,892 on each line like a nice little\nlist, but, of course, I do not. 19859 19:00:38,961 --> 19:00:41,001 Chrome thinks it might be Arabic. 19860 19:00:41,001 --> 19:00:44,748 But that's curious, too,\nbecause the Lang attribute 19861 19:00:45,831 --> 19:00:48,023 So Google is trying to override it. 19862 19:00:48,024 --> 19:00:49,732 All right, what's the\nobvious explanation 19863 19:00:49,732 --> 19:00:52,762 why we're seeing foo, bar, and\nbaz on the same line, and not 19864 19:00:53,932 --> 19:00:55,342 AUDIENCE: We didn't tell it. 19865 19:00:55,342 --> 19:00:57,009 SPEAKER 1: We didn't tell it to do that. 19866 19:00:57,009 --> 19:00:59,272 So we need paragraph tags,\nor maybe something else. 19867 19:00:59,271 --> 19:01:00,921 Turns out there is something else.
19868 19:01:00,922 --> 19:01:04,972 There is a UL tag, for an\nunordered list in HTML 19869 19:01:04,971 --> 19:01:08,596 inside of which you can\nhave LI tags, for list item 19870 19:01:08,596 --> 19:01:10,221 inside of which you can put your words. 19871 19:01:10,221 --> 19:01:13,861 So there's my foo, there's\nmy bar, there's my baz. 19872 19:01:13,861 --> 19:01:16,671 And, again, notice that VS Code\nis finishing my thought for me. 19873 19:01:16,672 --> 19:01:21,412 But notice the hierarchy, open\nUL, open LI, close LI, open LI 19874 19:01:21,411 --> 19:01:24,771 close LI, open LI, close LI, close UL. 19875 19:01:24,771 --> 19:01:27,141 So it's sort of done\nin reverse order here. 19876 19:01:27,142 --> 19:01:33,172 Let me go back to my browser, reload\n 19877 19:01:33,172 --> 19:01:36,382 a default bulleted list, that\nstill seems to be in Arabic. 19878 19:01:36,381 --> 19:01:38,152 What if I want this list to be numbered? 19879 19:01:38,152 --> 19:01:39,801 Well, you can probably guess. 19880 19:01:39,801 --> 19:01:43,372 If you don't want an unordered list, but\n 19881 19:01:44,661 --> 19:01:46,641 SPEAKER 1: OL, sure, so let's try that. 19882 19:01:46,642 --> 19:01:49,312 Not always as easy as just\nguessing, but in this case 19883 19:01:49,312 --> 19:01:51,062 OL is going to do the trick. 19884 19:01:51,062 --> 19:01:52,582 Let me go back to my other browser. 19885 19:01:52,581 --> 19:01:55,664 Let me reload the page, and now it's\n 19886 19:01:55,664 --> 19:01:58,111 It's a tiny thing, but\nthis is actually useful 19887 19:01:58,111 --> 19:02:00,451 if you have a very long\nlist of data, and maybe you 19888 19:02:00,452 --> 19:02:02,612 might add some things in the\nmiddle, the beginning, or the end. 19889 19:02:02,611 --> 19:02:04,944 It would just be annoying to\nhave to go and renumber it. 19890 19:02:04,945 --> 19:02:07,772 The computer is doing it\nfor us by, instead, just 19891 19:02:07,771 --> 19:02:10,152 numbering from top to bottom here.
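The two list variants described here can be sketched side by side. This is an illustrative page, not the lecture's exact file (which held one list and swapped UL for OL); the items are the conventional placeholder words foo, bar, and baz.

```html
<!DOCTYPE html>

<!-- Sketch of list.html showing both list types on one page for comparison -->
<html lang="en">
    <head>
        <title>list</title>
    </head>
    <body>
        <!-- Unordered list: the browser draws bullets by default -->
        <ul>
            <li>foo</li>
            <li>bar</li>
            <li>baz</li>
        </ul>

        <!-- Ordered list: the browser numbers the items itself, top to bottom -->
        <ol>
            <li>foo</li>
            <li>bar</li>
            <li>baz</li>
        </ol>
    </body>
</html>
```

Because the numbering is generated by the browser, inserting a new LI in the middle of the OL renumbers everything after it with no manual edits.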
19892 19:02:10,152 --> 19:02:12,211 All right, what about\nanother type of layout 19893 19:02:12,211 --> 19:02:14,959 not just paragraphs, not just\n 19894 19:02:14,959 --> 19:02:17,042 You've got some research\ndata you want to present 19895 19:02:17,042 --> 19:02:20,334 some financial data you want to present,\n 19896 19:02:20,334 --> 19:02:23,222 How might we go about laying\nout data, a la a table? 19897 19:02:23,221 --> 19:02:26,221 Well, let me create a\nfile called Table.html 19898 19:02:26,221 --> 19:02:28,921 and I'll just copy paste\nwhere we started earlier. 19899 19:02:28,922 --> 19:02:31,292 Let me start to close\nsome of these other files. 19900 19:02:31,292 --> 19:02:34,688 And in Table.html, this is\ngoing to be a bit more HTML 19901 19:02:34,688 --> 19:02:36,271 but I'm going to go ahead and do this. 19902 19:02:36,271 --> 19:02:40,592 Table and close table, tables\ncan have table headings. 19903 19:02:40,592 --> 19:02:45,881 So T head is the name of that tag, and\n 19904 19:02:45,881 --> 19:02:47,161 So I'm going to add that tag. 19905 19:02:47,161 --> 19:02:49,619 And this is a common technique,\nsort of start your thought 19906 19:02:49,619 --> 19:02:52,721 finish your thought, and then go\n 19907 19:02:52,721 --> 19:02:54,301 What do I want to put in this table? 19908 19:02:54,301 --> 19:02:58,391 How about a bunch of names and numbers. 19909 19:02:58,392 --> 19:03:02,352 So, for instance, like left\ncolumn name, right column number. 19910 19:03:02,351 --> 19:03:05,641 So let's create a table row,\nwith what's called the TR tag. 19911 19:03:05,642 --> 19:03:10,382 Let's create a table heading with\n 19912 19:03:10,381 --> 19:03:14,262 Let's create another table\nheading called number here. 19913 19:03:14,262 --> 19:03:17,612 And all of that, to be\nclear, is in one table row. 19914 19:03:17,611 --> 19:03:21,331 Meanwhile, in the table body,\nlet me create another table row 19915 19:03:21,331 --> 19:03:23,101 but this time, it's not a heading. 
19916 19:03:23,101 --> 19:03:24,581 Now I'm in the guts of my table. 19917 19:03:24,581 --> 19:03:28,171 Let's do table data, which is synonymous\n 19918 19:03:28,172 --> 19:03:30,872 in like an Excel spreadsheet\nor Google spreadsheet. 19919 19:03:30,872 --> 19:03:33,722 In this TD, I'm going to\nsay like Carter's name 19920 19:03:33,721 --> 19:03:39,481 and then lets grab Carter's number\n 19921 19:03:39,482 --> 19:03:43,211 Then let's put me into the mix, and\n 19922 19:03:44,342 --> 19:03:47,461 But we'll see that there's a lot\nof shared structure with HTML. 19923 19:03:47,461 --> 19:03:53,771 Let me go ahead and do mine,\n 19924 19:03:53,771 --> 19:03:56,191 So we're getting to be\na lot of indentation. 19925 19:03:56,191 --> 19:03:59,711 I'm using four spaces by default.\n 19926 19:03:59,711 --> 19:04:02,191 So long as you're consistent,\nthat's considered good style. 19927 19:04:02,191 --> 19:04:04,981 But let me go back to my\nbrowser here, and hit back. 19928 19:04:04,982 --> 19:04:07,232 That then brings me to my\ndirectory listing again. 19929 19:04:07,232 --> 19:04:10,562 Here's Table.html, and this\nis not that interesting yet. 19930 19:04:10,562 --> 19:04:13,532 But you can see that there's\ntwo columns, name and number. 19931 19:04:13,532 --> 19:04:18,302 Because it's a table heading, TH,\n 19932 19:04:18,301 --> 19:04:22,111 In there, in the table, are two\n 19933 19:04:22,111 --> 19:04:25,261 It's a little, oh, I forgot my\nnumber one, sorry about that. 19934 19:04:25,262 --> 19:04:28,197 One and one, it's not the\nprettiest table, right? 19935 19:04:28,197 --> 19:04:30,072 I feel like I kind of\nwant to separate things 19936 19:04:30,072 --> 19:04:32,154 a little more, maybe put\nsome borders or the like. 19937 19:04:32,154 --> 19:04:36,281 But with HTML alone, I'm really\nfocusing on the structure alone. 19938 19:04:36,282 --> 19:04:37,902 So we'll make this prettier soon. 
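A sketch of the table structure walked through above: a THEAD with one row of TH cells, then a TBODY with one TR per person and TD cells for the data. The phone numbers below are made-up placeholders (the transcript does not preserve the real ones), and the title is assumed.

```html
<!DOCTYPE html>

<!-- Sketch of table.html; phone numbers are fictional placeholders -->
<html lang="en">
    <head>
        <title>table</title>
    </head>
    <body>
        <table>
            <thead>
                <!-- One row holding the two column headings -->
                <tr>
                    <th>Name</th>
                    <th>Number</th>
                </tr>
            </thead>
            <tbody>
                <!-- Each tr is a row; each td is one cell of data -->
                <tr>
                    <td>Carter</td>
                    <td>+1-617-555-0100</td>
                </tr>
                <tr>
                    <td>David</td>
                    <td>+1-617-555-0101</td>
                </tr>
            </tbody>
        </table>
    </body>
</html>
```

Borders and spacing are deliberately absent here; HTML only describes the structure, and styling is left to CSS.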
19939 19:04:37,902 --> 19:04:41,331 But for now, this is how you\nmight lay out tabular data. 19940 19:04:41,331 --> 19:04:44,081 All right, let me pause here just\n 19941 19:04:44,081 --> 19:04:46,411 But, again, the goal right now\nis just to kind of throw at you 19942 19:04:46,411 --> 19:04:49,801 some basic building blocks, that, again,\n 19943 19:04:49,801 --> 19:04:53,671 But we're going to start\nstylizing these things soon, too. 19944 19:04:55,691 --> 19:04:57,387 SPEAKER 1: How do you indent paragraphs? 19945 19:04:58,262 --> 19:04:59,822 For that, we're probably\ngoing to want something 19946 19:04:59,822 --> 19:05:01,308 called CSS, Cascading Style Sheets. 19947 19:05:01,308 --> 19:05:03,391 So let me come back to\nthat, in just a little bit. 19948 19:05:03,392 --> 19:05:06,782 For the stylization of these things,\n 19949 19:05:06,782 --> 19:05:10,172 we're going to need a\ndifferent language altogether. 19950 19:05:10,172 --> 19:05:13,172 All right, well, let's\nnow create what the web 19951 19:05:13,172 --> 19:05:17,702 is full of, which is like\nphotographs and images and the like. 19952 19:05:17,702 --> 19:05:22,832 Let me go ahead and create a new file\n 19953 19:05:22,831 --> 19:05:25,781 and change the title\nhere to be, say, Image. 19954 19:05:25,782 --> 19:05:29,189 And then, in the body of this page,\n 19955 19:05:29,188 --> 19:05:31,771 The interesting thing about an\nimage is that it's actually not 19956 19:05:31,771 --> 19:05:35,251 going to have a start tag and an end\n 19957 19:05:35,251 --> 19:05:38,281 Like, how can you start an image\nand then eventually finish it? 19958 19:05:38,282 --> 19:05:39,902 It's either there or it isn't. 19959 19:05:39,902 --> 19:05:42,542 So some tags do not have end tags. 19960 19:05:42,542 --> 19:05:49,081 So let me do image, IMG,\nsource equals Harvard.jpeg.
19961 19:05:49,081 --> 19:05:51,902 And let me go ahead, and,\nin my terminal window 19962 19:05:51,902 --> 19:05:54,301 I actually came with a photo of Harvard. 19963 19:05:54,301 --> 19:05:57,751 Let me grab this for just a second. 19964 19:05:57,751 --> 19:06:01,171 Let me grab Harvard.jpeg and\nput it into my directory 19965 19:06:01,172 --> 19:06:03,822 pretend that I downloaded\nthat in advance. 19966 19:06:03,822 --> 19:06:06,422 And so I'm referring\nto now a file called 19967 19:06:06,422 --> 19:06:12,032 Harvard.jpeg, that apparently is in\n 19968 19:06:12,032 --> 19:06:15,211 If this image were on the\ninternet, like Harvard's server 19969 19:06:15,211 --> 19:06:21,402 I could also say like\nHTTPS://www.Harvard.edu/FolderName 19970 19:06:21,402 --> 19:06:25,562 whatever it is, /Harvard.jpeg, but\n 19971 19:06:25,562 --> 19:06:28,772 to your own VS Code environment,\nlike I did before class 19972 19:06:28,771 --> 19:06:32,072 by dragging and dropping this\nwhole file, this photo of Harvard 19973 19:06:32,072 --> 19:06:35,402 you can just refer to it\nrelatively, so to speak. 19974 19:06:35,402 --> 19:06:38,521 This would be the same thing\nas saying ./Harvard.jpeg 19975 19:06:38,521 --> 19:06:41,941 go to the current directory and\n 19976 19:06:41,941 --> 19:06:43,861 But that's unnecessary to type. 19977 19:06:43,861 --> 19:06:47,461 For accessibility purposes, though,\n 19978 19:06:47,461 --> 19:06:51,122 it's ideal if we also give this\nan alternative text, something 19979 19:06:51,122 --> 19:06:55,021 like Harvard University,\nin the so-called alt attribute 19980 19:06:55,021 --> 19:06:57,331 and this is so that\nscreen readers will recite 19981 19:06:57,331 --> 19:06:59,581 what it is the photo is,\nfor folks who can't see it.
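A small sketch of the image page as described: an IMG tag with no end tag, a relative `src`, and alternative text. The file name is written as it appears in the transcript; the actual capitalization of the file on disk is an assumption.

```html
<!DOCTYPE html>

<!-- Sketch of image.html; assumes Harvard.jpeg sits in the same folder as this file -->
<html lang="en">
    <head>
        <title>image</title>
    </head>
    <body>
        <!-- img has no closing tag; alt is read aloud by screen readers
             and shown if the image fails to load -->
        <img alt="Harvard University" src="Harvard.jpeg">
    </body>
</html>
```

Writing `src="./Harvard.jpeg"` would point at the same file, since both forms are resolved relative to the page's own location.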
19982 19:06:59,581 --> 19:07:02,072 And if you're just on a slow\nconnection, sometimes you'll 19983 19:07:02,072 --> 19:07:04,051 see the text of what\nyou're about to see 19984 19:07:04,051 --> 19:07:07,601 before the image itself downloads,\n 19985 19:07:07,601 --> 19:07:12,241 So let's now go back to my open browser\n 19986 19:07:12,241 --> 19:07:16,531 I now have Harvard.jpeg, which I\n 19987 19:07:16,532 --> 19:07:20,252 Let me click on Image.html,\nand here we have 19988 19:07:20,251 --> 19:07:25,171 a really big picture of Memorial\n 19989 19:07:25,172 --> 19:07:29,762 Suffice it to say I should probably fix\n 19990 19:07:29,762 --> 19:07:33,271 But to do that, we're going to probably\n 19991 19:07:33,271 --> 19:07:36,842 There are some historical\nattributes that you can still 19992 19:07:36,842 --> 19:07:39,283 use to control width and\nheight, and so forth. 19993 19:07:39,283 --> 19:07:41,491 But we're going to do it\nthe better way, so to speak 19994 19:07:41,491 --> 19:07:44,042 with a language designed for just that. 19995 19:07:45,482 --> 19:07:50,402 I also came prepared with,\nlet me grab another file here 19996 19:07:50,402 --> 19:07:55,782 let me grab a file called\nHalloween.mp4, which is an MPEG file. 19997 19:07:55,782 --> 19:08:02,822 And let me go ahead and change this\n 19998 19:08:02,822 --> 19:08:04,592 I'll change my title to be Video. 19999 19:08:04,592 --> 19:08:07,982 And let's go ahead and now\nintroduce another tag, a video tag 20000 19:08:07,982 --> 19:08:13,172 open bracket video, and then let me go\n 20001 19:08:13,172 --> 19:08:17,461 And then inside of the video tag,\n 20002 19:08:17,461 --> 19:08:22,501 is going to be specifically\n 20003 19:08:22,501 --> 19:08:27,511 I know, is Video/mp4, because I looked\n 20004 19:08:27,512 --> 19:08:29,702 And the video tag actually\nhas a few attributes. 20005 19:08:29,702 --> 19:08:31,711 I can have this thing autoplay. 
20006 19:08:33,392 --> 19:08:36,602 I can mute it, so that there's no\n 20007 19:08:36,601 --> 19:08:41,381 Most browsers, to prevent ads, don't\n 20008 19:08:41,381 --> 19:08:44,581 So if you mute your video, it\nwill autoplay, but presumably not 20009 19:08:45,601 --> 19:08:49,932 And let me set the width of this thing\n 20010 19:08:49,932 --> 19:08:51,581 But I can make it any size I want. 20011 19:08:51,581 --> 19:08:55,292 So I know this just from having\n 20012 19:08:57,122 --> 19:08:59,822 Sometimes attributes don't have values. 20013 19:09:01,051 --> 19:09:04,387 They're just single words,\nautoplay, loop, muted 20014 19:09:04,387 --> 19:09:06,512 and that kind of makes\nsense for any attribute that 20015 19:09:08,342 --> 19:09:11,372 Like, it doesn't make sense\nto say muted equals something. 20016 19:09:11,372 --> 19:09:12,722 Like it's either muted or not. 20017 19:09:12,721 --> 19:09:14,322 The attribute is there or not. 20018 19:09:14,322 --> 19:09:16,361 Similarly, for these others, as well. 20019 19:09:16,361 --> 19:09:19,981 So let me go back to my other browser\n 20020 19:09:19,982 --> 19:09:24,122 There is both my mp4\nand also Video.html 20021 19:09:24,122 --> 19:09:26,202 which is the web page that embeds it. 20022 19:09:26,202 --> 19:09:29,792 And this is actually a video that was\n 20023 19:09:30,941 --> 19:09:35,221 So we included it in this demo here. 20024 19:09:35,221 --> 19:09:41,792 This is the video that was on\n 20025 19:09:41,792 --> 19:09:44,009 But you can see here that\nan image alone probably 20026 19:09:44,009 --> 19:09:45,301 would not have the same effect. 20027 19:09:45,301 --> 19:09:48,494 This is actually a movie, a small\nvideo file that's now looping. 20028 19:09:48,494 --> 19:09:51,661 Now there's some artifacts here, like\n 20029 19:09:51,661 --> 19:09:53,494 I feel like it'd be\nnice to fill the screen. 
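The video markup described above might look roughly like this. The attributes autoplay, loop, and muted and the `video/mp4` type come from the lecture; the width value is an assumed placeholder, since the number used in class is not preserved in the transcript.

```html
<!DOCTYPE html>

<!-- Sketch of video.html; assumes Halloween.mp4 sits in the same folder -->
<html lang="en">
    <head>
        <title>video</title>
    </head>
    <body>
        <!-- autoplay, loop, and muted are boolean attributes: present means on.
             width="1280" is a placeholder value, not the one used in the lecture. -->
        <video autoplay loop muted width="1280">
            <source src="Halloween.mp4" type="video/mp4">
        </video>
    </body>
</html>
```

Dropping `muted` would likely stop the video from starting on its own in most browsers, which is the ad-prevention behavior mentioned above.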
20030 19:09:53,494 --> 19:09:58,031 But again, we'll come back to a language\n 20031 19:09:58,032 --> 19:10:00,272 Well, it's not just\nvideos like this, that you 20032 19:10:00,271 --> 19:10:01,982 might want to put into a web page. 20033 19:10:01,982 --> 19:10:06,283 Let me create another\nfile called iFrame.html. 20034 19:10:06,283 --> 19:10:09,241 If you've ever poked around with, if\n 20035 19:10:09,241 --> 19:10:12,565 or if you had your own blog or\n 20036 19:10:12,565 --> 19:10:14,732 you might have been in the\nhabit of embedding videos 20037 19:10:14,732 --> 19:10:17,672 in websites, using like\nembedded YouTube players. 20038 19:10:17,672 --> 19:10:21,362 Well, this is possible, too, using\n 20039 19:10:22,411 --> 19:10:25,531 And an iFrame is just a tag\nthat is literally iFrame. 20040 19:10:25,532 --> 19:10:28,960 It has source equals,\nand then a URL, and if it 20041 19:10:28,960 --> 19:10:32,252 happens to be a YouTube video, there's\n 20042 19:10:32,251 --> 19:10:33,811 per YouTube's documentation. 20043 19:10:33,812 --> 19:10:41,652 So you might do www.youtube.com, embed,\n 20044 19:10:41,652 --> 19:10:47,191 So this is essentially what we do, if\n 20045 19:10:47,191 --> 19:10:50,402 videos, in the course's website, or\n 20046 19:10:50,402 --> 19:10:53,911 If I want to allow full screen,\nI can add this attribute, too 20047 19:10:53,911 --> 19:10:56,661 that I know exists, by just\nhaving checked the documentation. 20048 19:10:56,661 --> 19:11:00,720 And if I now go back to my browser\n 20049 19:11:02,039 --> 19:11:03,872 It's not going to fill\nthe screen, because I 20050 19:11:03,872 --> 19:11:05,760 haven't customized the aesthetics yet. 
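A sketch of the iframe embed being described. `VIDEO_ID` is a deliberate placeholder: the transcript cuts off before the actual identifier, so substitute the ID of whichever YouTube video you want to embed.

```html
<!DOCTYPE html>

<!-- Sketch of iframe.html; VIDEO_ID is a placeholder, not a real video -->
<html lang="en">
    <head>
        <title>iframe</title>
    </head>
    <body>
        <!-- allowfullscreen is a boolean attribute that lets the player go full screen -->
        <iframe allowfullscreen src="https://www.youtube.com/embed/VIDEO_ID"></iframe>
    </body>
</html>
```

Unlike IMG, the IFRAME tag does need its closing tag, even though nothing goes between the two.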
20051 19:11:05,760 --> 19:11:10,111 But it does seem to embed a tiny little\n 20052 19:11:10,721 --> 19:11:13,801 So we could change the width, change\n 20053 19:11:14,471 --> 19:11:19,171 But an iFrame is a way of embedding\n 20054 19:11:19,171 --> 19:11:21,961 page, if they allow\nit, so as to create all 20055 19:11:21,961 --> 19:11:25,652 the more of an interactive experience\n 20056 19:11:25,652 --> 19:11:28,652 All right, well, the web is, of\n 20057 19:11:28,652 --> 19:11:32,072 Let's go ahead and create\na file called Link.html. 20058 19:11:32,072 --> 19:11:35,581 And if we want to create a web page that\n 20059 19:11:35,581 --> 19:11:40,100 else, let's go ahead and do this,\n 20060 19:11:44,551 --> 19:11:48,182 Now, in like Facebook, Instagram, a lot\n 20061 19:11:48,182 --> 19:11:50,851 in a domain name, or a\nfully qualified domain name 20062 19:11:50,851 --> 19:11:52,771 it automatically becomes a link. 20063 19:11:52,771 --> 19:11:56,161 That's because those websites have\n 20064 19:11:56,161 --> 19:12:00,550 detects something that looks like a\n 20065 19:12:00,551 --> 19:12:02,452 HTML itself does not do that for you. 20066 19:12:02,452 --> 19:12:06,822 And so if I go back to my web\npage here, click on Link.html 20067 19:12:06,822 --> 19:12:09,851 if you type visit\nHarvard.edu period, that's 20068 19:12:09,851 --> 19:12:11,592 all you're literally going to see. 20069 19:12:11,592 --> 19:12:15,192 But instinctively, even if you've\n 20070 19:12:15,191 --> 19:12:19,326 we probably do here\nto solve this problem? 20071 19:12:19,327 --> 19:12:20,952 What could we do to solve this problem. 20072 19:12:20,952 --> 19:12:22,244 What do I probably want to add. 
20073 19:12:24,461 --> 19:12:27,702 SPEAKER 1: Yeah, so I want to surround\n 20074 19:12:27,702 --> 19:12:30,285 And you wouldn't necessarily\nknow this until someone told you 20075 19:12:30,285 --> 19:12:33,342 or you looked it up, but the tag for\n 20076 19:12:33,342 --> 19:12:35,832 called the A tag for anchor. 20077 19:12:35,831 --> 19:12:39,251 It has an attribute called\nHREF for hyper-reference 20078 19:12:39,251 --> 19:12:42,791 which is like a link in\nthe virtual world to a URL. 20079 19:12:42,792 --> 19:12:45,881 So let me type in Harvard's\nfull and proper URL here. 20080 19:12:45,881 --> 19:12:48,101 Then I'm going to close the tag. 20081 19:12:48,101 --> 19:12:53,411 And then I can still say Harvard.edu,\n 20082 19:12:53,411 --> 19:12:58,961 But the place they're going to go\n 20083 19:13:00,971 --> 19:13:03,281 Now if I go back here\nand reload the page 20084 19:13:03,282 --> 19:13:05,322 now it automatically gets underlined. 20085 19:13:05,322 --> 19:13:07,092 It happens to be purple by default. Why? 20086 19:13:07,092 --> 19:13:09,292 Because we visited\nHarvard.edu a few minutes ago. 20087 19:13:09,292 --> 19:13:12,642 So my browser, by default, is indicating\n 20088 19:13:12,642 --> 19:13:14,592 But now I have a link\nthat I can click on 20089 19:13:14,592 --> 19:13:18,881 and if I hover over it but don't click,\n 20090 19:13:18,881 --> 19:13:22,842 there's a little clue as to where\n 20091 19:13:24,012 --> 19:13:26,082 And without going too\nfar down a rabbit hole 20092 19:13:26,081 --> 19:13:29,471 but to tie together our discussion\nof cybersecurity recently 20093 19:13:29,471 --> 19:13:32,861 what if I were to do\nsomething like this. 
20094 19:13:32,861 --> 19:13:37,331 Right now you have the beginnings\nof a phishing attack of sorts 20095 19:13:37,331 --> 19:13:42,281 P-H-I-S-H-I-N-G, whereby you can\n 20096 19:13:42,282 --> 19:13:46,542 even an email using HTML, that tells\n 20097 19:13:46,542 --> 19:13:49,612 but they're really going to\ngo someplace else altogether. 20098 19:13:49,611 --> 19:13:52,121 And that is the essence of\nphishing attacks these days. 20099 19:13:52,122 --> 19:13:55,452 If you've ever gotten a bogus\nemail pretending to be from PayPal 20100 19:13:55,452 --> 19:13:57,912 or your bank or some\nother website, odds are 20101 19:13:57,911 --> 19:14:00,432 they've just written HTML\nthat says whatever they want 20102 19:14:00,432 --> 19:14:04,042 but the underlying tags might\ndo something very different. 20103 19:14:04,042 --> 19:14:06,792 And so having the instinct to look\n 20104 19:14:06,792 --> 19:14:10,211 or be a little suspicious when you're\n 20105 19:14:10,211 --> 19:14:13,601 it's this easy to socially\nengineer people, that is 20106 19:14:13,601 --> 19:14:18,342 deceive them, by just saying one\nthing and linking to another. 20107 19:14:18,342 --> 19:14:22,572 Well, what if I want to link my page\n 20108 19:14:22,572 --> 19:14:25,422 Well, if I want to link\nto that photo of Harvard 20109 19:14:25,422 --> 19:14:28,722 I can just do HREF = equals quote\n 20110 19:14:28,721 --> 19:14:32,121 in my same account, that\nis itself a web page. 20111 19:14:32,122 --> 19:14:35,682 So this is how you can create\nrelative links, multi-page web 20112 19:14:35,682 --> 19:14:38,271 pages, multi-page websites, yourself. 20113 19:14:38,271 --> 19:14:41,652 So if I now reload this\npage, hover over Harvard.edu 20114 19:14:41,652 --> 19:14:45,191 you'll see in the bottom left\nhand corner a very long URL. 
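The three kinds of link discussed here can be sketched in one page. The second link uses example.com purely as a stand-in for "some other destination" (the lecture's actual target is not preserved in the transcript), and the relative link to Image.html is an assumption based on the description of the file created earlier.

```html
<!DOCTYPE html>

<!-- Sketch of link.html with three variations, for comparison -->
<html lang="en">
    <head>
        <title>link</title>
    </head>
    <body>
        <!-- 1. Honest link: the visible text matches where href goes -->
        <p>
            Visit <a href="https://www.harvard.edu/">Harvard.edu</a>.
        </p>

        <!-- 2. Misleading link: the text says one thing, href goes elsewhere.
             This mismatch is the basic trick behind phishing. -->
        <p>
            Visit <a href="https://www.example.com/">Harvard.edu</a>.
        </p>

        <!-- 3. Relative link: points at another file in the same folder -->
        <p>
            Visit <a href="Image.html">Harvard.edu</a>.
        </p>
    </body>
</html>
```

Hovering over each link and reading the URL in the corner of the browser window is the quickest way to see that the second one does not go where its text claims.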
20115 19:14:45,191 --> 19:14:48,101 But that's because I'm in\nCodespaces right now, VS Code 20116 19:14:48,101 --> 19:14:51,762 and it's appending automatically\nto the end of my current URL 20117 19:14:54,822 --> 19:14:57,251 When I click on this, I\ngo immediately to that 20118 19:14:57,251 --> 19:15:01,151 file we created earlier, with a\ncrazy, big version of the image. 20119 19:15:01,152 --> 19:15:03,822 But that's just a way\nthat one page on a website 20120 19:15:03,822 --> 19:15:07,721 can link to another page on a website. 20121 19:15:07,721 --> 19:15:11,711 Let's do one other thing here,\nmaking things more responsive 20122 19:15:11,711 --> 19:15:14,622 because, in fact, that wasn't a\nparticularly responsive website. 20123 19:15:14,622 --> 19:15:17,702 Responsive means responding to the\n 20124 19:15:17,702 --> 19:15:20,202 is so important when someone\nmight be on a screen like this 20125 19:15:20,202 --> 19:15:22,032 or on a screen like this these days. 20126 19:15:22,032 --> 19:15:26,892 There are special tags we can use to\n 20127 19:15:28,271 --> 19:15:32,381 So let me create a file\ncalled Responsive.html. 20128 19:15:32,381 --> 19:15:36,131 I'm going to copy/paste some starting\n 20129 19:15:36,131 --> 19:15:40,751 And let me go ahead and just grab, let\n 20130 19:15:40,751 --> 19:15:46,341 from before, just so that we have a\n 20131 19:15:46,342 --> 19:15:50,172 And let me go ahead and\ngrab this text here. 20132 19:15:50,172 --> 19:15:53,632 And I'm just going to paste\nthis into the body of this page. 20133 19:15:54,411 --> 19:15:57,762 So I just have a big paragraph,\n 20134 19:15:57,762 --> 19:15:59,532 Let me go back to my browser. 20135 19:15:59,532 --> 19:16:02,622 Let me open up this file,\ncalled Responsive.html 20136 19:16:02,622 --> 19:16:05,442 to make the point that\nit is not yet responsive. 20137 19:16:05,441 --> 19:16:07,781 Let me go ahead and\nclick on Responsive.html.
20138 19:16:08,961 --> 19:16:12,381 But here's another trick you can do,\n 20139 19:16:12,881 --> 19:16:14,921 You can pretend to be another device. 20140 19:16:14,922 --> 19:16:19,332 Let me go to View, developer,\ndeveloper tools again. 20141 19:16:19,331 --> 19:16:21,779 Last time we used this to\nuse the Network tab, which 20142 19:16:21,779 --> 19:16:24,822 was kind of interesting, because we\n 20143 19:16:25,822 --> 19:16:29,111 But notice, we can also click on\nthis icon, in Chrome, at least 20144 19:16:29,111 --> 19:16:31,151 that looks like a mobile phone. 20145 19:16:31,152 --> 19:16:36,199 I can turn my laptop into what looks\n 20146 19:16:36,198 --> 19:16:39,281 I'm going to click the dot dot dot\n 20147 19:16:39,282 --> 19:16:41,502 Instead of on the bottom,\nwhere it might be by default 20148 19:16:41,501 --> 19:16:43,281 I'm going to move it\nto the right hand side. 20149 19:16:43,282 --> 19:16:45,312 So that now on the left,\nyou see what looks more 20150 19:16:45,312 --> 19:16:47,082 like the shape of a vertical phone. 20151 19:16:47,081 --> 19:16:49,841 And, in fact, if I go\nto my dimensions here 20152 19:16:49,842 --> 19:16:53,741 I'll choose something like\niPhone X, so a few years back. 20153 19:16:53,741 --> 19:16:57,661 Here's what that same website might\n 20154 19:16:57,661 --> 19:17:01,322 that looks pretty damn\nsmall, to be able to read it. 20155 19:17:01,322 --> 19:17:03,842 And that's because the\nwebsite has not automatically 20156 19:17:03,842 --> 19:17:07,771 responded to the fairly narrow\ndimensions of the iPhone 20157 19:17:07,771 --> 19:17:09,971 in question, or Android\ndevice, or whatnot. 20158 19:17:09,971 --> 19:17:11,411 So let me go ahead and do this. 20159 19:17:11,411 --> 19:17:13,391 Let me go back into my code. 
20160 19:17:13,392 --> 19:17:16,382 And let me go into the head of\nthe page, and for the first time 20161 19:17:18,149 --> 19:17:20,191 This word is now all over\nthe internet, but there 20162 19:17:20,191 --> 19:17:23,131 is a metatag that is\ncalled, that allows you 20163 19:17:23,131 --> 19:17:27,009 to specify the name of some kind\nof configuration detail here 20164 19:17:28,051 --> 19:17:31,081 Viewport is the technical term\nfor the rectangular region 20165 19:17:31,081 --> 19:17:32,741 that the human sees in a browser. 20166 19:17:32,741 --> 19:17:35,491 It's essentially the body of the\n 20167 19:17:37,051 --> 19:17:39,751 And you can specify the\ncontent of the viewport 20168 19:17:39,751 --> 19:17:41,761 should have an initial scale of 1. 20169 19:17:41,762 --> 19:17:43,652 So it shouldn't be zoomed in or out. 20170 19:17:43,652 --> 19:17:46,292 And the width that the\nbrowser should assume 20171 19:17:46,292 --> 19:17:48,812 should be equal to the device's width. 20172 19:17:48,812 --> 19:17:51,722 These are sort of magical statements\nthat you just have to know 20173 19:17:51,721 --> 19:17:55,771 or copy/paste or transcribe, that\njust express, to the browser 20174 19:17:55,771 --> 19:17:59,611 assume that the width of the page is the\n 20175 19:17:59,611 --> 19:18:03,451 Don't assume the luxury of a\nbig laptop or desktop computer. 20176 19:18:03,452 --> 19:18:08,432 Now, making only that change, let\n 20177 19:18:08,432 --> 19:18:10,682 here, using Chrome's developer tools. 20178 19:18:13,232 --> 19:18:18,722 And now, it's not very effective on this\n 20179 19:18:23,501 --> 19:18:28,351 So if I zoom in to 100%, this would be\n 20180 19:18:28,351 --> 19:18:30,372 readable than it would\nhave been a moment ago 20181 19:18:30,372 --> 19:18:33,039 even though I realized that demo\nwas not necessarily persuasive. 
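The viewport setting described above comes down to one META tag in the head of the page. In this sketch, the title and paragraph text are placeholders; the `name` and `content` values are the standard ones for this purpose.

```html
<!DOCTYPE html>

<!-- Sketch of responsive.html; the paragraph text is a placeholder -->
<html lang="en">
    <head>
        <!-- initial-scale=1: do not zoom in or out at first.
             width=device-width: treat the page as being as wide as the device's screen. -->
        <meta name="viewport" content="initial-scale=1, width=device-width">
        <title>responsive</title>
    </head>
    <body>
        <p>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
            tempor incididunt ut labore et dolore magna aliqua.
        </p>
    </body>
</html>
```

Without this tag, phone browsers typically assume a wide desktop-style page and shrink it to fit, which is why the text looked so small in the emulated iPhone view.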
20182 19:18:33,039 --> 19:18:34,832 But it's as simple as\ntelling the browser 20183 19:18:34,831 --> 19:18:38,801 to resize the thing to\nthe width of the page. 20184 19:18:38,801 --> 19:18:41,971 All right, let me pause here to see\n 20185 19:18:41,971 --> 19:18:43,591 feels like enough HTML tags. 20186 19:18:43,592 --> 19:18:45,312 We'll add just a couple of more in. 20187 19:18:45,312 --> 19:18:47,402 But for the most part,\nlike HTML tags are 20188 19:18:47,402 --> 19:18:51,241 things you Google and figure out over\n 20189 19:18:51,241 --> 19:18:54,182 The basic building blocks\nare tags, attributes. 20190 19:18:54,182 --> 19:18:55,652 Some attributes have values. 20191 19:18:56,461 --> 19:19:00,175 And that's sort of the\nstructure of HTML in essence. 20192 19:19:00,175 --> 19:19:01,592 Questions on any of these, though. 20193 19:19:02,330 --> 19:19:03,961 AUDIENCE: Do attributes have an order? 20194 19:19:03,961 --> 19:19:05,611 SPEAKER 1: Do attributes have an order? 20195 19:19:05,611 --> 19:19:08,701 No, attributes can be in any\norder, from left to right. 20196 19:19:08,702 --> 19:19:11,702 I tend to be a little nit-picky,\nand so I alphabetize them 20197 19:19:11,702 --> 19:19:14,917 if only because then I can easily\nspot if something's missing 20198 19:19:14,917 --> 19:19:16,292 if it's not there alphabetically. 20199 19:19:16,292 --> 19:19:21,952 Most people on the internet\ndon't seem to do that. 20200 19:19:24,801 --> 19:19:26,811 I mentioned that HTML\nis starting to replace 20201 19:19:26,812 --> 19:19:28,851 other languages for user interfaces. 20202 19:19:28,851 --> 19:19:30,292 And it's not just HTML alone. 
20203 19:19:30,292 --> 19:19:34,101 It's HTML with CSS, with JavaScript,\n 20204 19:19:35,211 --> 19:19:37,911 That rather has been the\ntrend for portability 20205 19:19:37,911 --> 19:19:40,581 and the ability for companies,\nfor individual programmers 20206 19:19:40,581 --> 19:19:42,951 to write one version\nof an app and have it 20207 19:19:42,952 --> 19:19:47,092 work on Android devices and iPhones\n 20208 19:19:48,351 --> 19:19:51,411 It is very time-consuming to\nlearn a language like Java 20209 19:19:51,411 --> 19:19:54,322 and write an Android app, learn\nanother language called Swift 20210 19:19:54,322 --> 19:19:56,991 and make an iOS app, not to\nmention make them look and behave 20211 19:19:56,991 --> 19:19:59,123 the same, not to\nmention fix a bug in one 20212 19:19:59,123 --> 19:20:00,831 and then remember to\nfix it in the other. 20213 19:20:00,831 --> 19:20:05,061 I mean, this is just very painful\nand time-consuming and costly. 20214 19:20:05,062 --> 19:20:09,351 So this standardization on\nHTML, CSS, and JavaScript 20215 19:20:09,351 --> 19:20:13,432 even for mobile apps and web apps,\n 20216 19:20:13,432 --> 19:20:16,012 because it solves problems like that. 20217 19:20:16,012 --> 19:20:19,851 All right, so let's go ahead and now do\n 20218 19:20:19,851 --> 19:20:22,161 All of these pages thus\nfar are really just tastes 20219 19:20:22,161 --> 19:20:25,042 of static content, content\nthat does not change. 20220 19:20:25,042 --> 19:20:27,991 Well, let's go ahead and do this. 20221 19:20:27,991 --> 19:20:31,012 Let me introduce one other\nformat of URLs, which looks 20222 19:20:31,012 --> 19:20:33,051 a little something like it did before. 20223 19:20:33,051 --> 19:20:36,451 So slash path, but it could\nactually be something like this 20224 19:20:36,452 --> 19:20:40,012 slash path question\nmark, key equals value. 
20225 19:20:40,012 --> 19:20:42,021 You might not have noticed,\nor cared to notice 20226 19:20:42,021 --> 19:20:44,641 the URLs in your URL bar every day. 20227 19:20:44,642 --> 19:20:46,282 But these things are everywhere. 20228 19:20:46,282 --> 19:20:49,222 Often when you type into a\nsearch engine like Google 20229 19:20:49,221 --> 19:20:52,491 a search query, whatever you\njust typed ends up in the URL. 20230 19:20:52,491 --> 19:20:55,099 When you click on a link that\ncontains some information 20231 19:20:55,099 --> 19:20:57,682 there might be a question mark,\nand then some keys and values. 20232 19:20:57,682 --> 19:21:00,471 There might be an ampersand\nand more keys and values. 20233 19:21:00,471 --> 19:21:02,901 Here, again, is that very\ncommon programming paradigm 20234 19:21:02,902 --> 19:21:05,062 of just associating keys with values. 20235 19:21:07,021 --> 19:21:11,391 Let me actually go to\ngoogle.com, in a browser 20236 19:21:11,392 --> 19:21:16,042 here, and let me search for something\n 20237 19:21:16,042 --> 19:21:20,482 Enter, notice now that my\nURL changed from google.com 20238 19:21:20,482 --> 19:21:23,991 to google.com slash\nsearch question mark 20239 19:21:23,991 --> 19:21:27,081 Q equals cats, ampersand\nand then a bunch of stuff 20240 19:21:27,081 --> 19:21:28,581 that I don't understand or know. 20241 19:21:28,581 --> 19:21:33,501 So let's just delete it for now, and\n 20242 19:21:34,559 --> 19:21:37,101 If I zoom out here, years ago\nyou would get pictures of cats. 20243 19:21:37,101 --> 19:21:41,122 Now you get videos of the movie. 20244 19:21:41,122 --> 19:21:44,332 And then that top query\nthere, is Cats a bad movie. 20245 19:21:44,331 --> 19:21:46,822 But we can also, of\ncourse, click on Images. 20246 19:21:46,822 --> 19:21:49,521 And there are the\nadorable cat, creepy cats. 20247 19:21:49,521 --> 19:21:52,342 All right, this didn't used to\nhappen when we searched for cats. 
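A minimal sketch of the URL format just described, written as ordinary HTML links. The first link uses the `q=cats` pair from the search above. The second adds an `hl=en` pair purely to illustrate the ampersand separator; that parameter is an assumption for illustration, not something shown in the lecture.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>query strings</title>
    </head>
    <body>
        <!-- Everything after the ? is a list of key=value pairs -->
        <a href="https://www.google.com/search?q=cats">One pair: q=cats</a>

        <!-- Additional pairs are separated by &, written as &amp; inside HTML.
             hl=en is an illustrative assumption. -->
        <a href="https://www.google.com/search?q=cats&amp;hl=en">Two pairs: q=cats and hl=en</a>
    </body>
</html>
```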
20248 19:21:52,342 --> 19:21:57,922 But anyhow, the point is that the URL\n 20249 19:21:57,922 --> 19:22:00,562 And this is such a simple,\nbut such a powerful thing. 20250 19:22:00,562 --> 19:22:04,822 This is how humans\nprovide input to servers. 20251 19:22:04,822 --> 19:22:07,741 They don't manually create the\nURLs, like I sort of just did. 20252 19:22:07,741 --> 19:22:10,792 But when you fill out a form\non the web and you hit Enter 20253 19:22:10,792 --> 19:22:13,672 typically the URL suddenly\nchanges to include 20254 19:22:13,672 --> 19:22:16,552 whatever you typed in,\nin the URL, assuming 20255 19:22:16,551 --> 19:22:18,891 the form is using the verb GET. 20256 19:22:20,331 --> 19:22:22,123 If you're typing in a\nusername, a password 20257 19:22:22,123 --> 19:22:25,331 credit card information, because you\n 20258 19:22:25,331 --> 19:22:27,951 at your laptop to see literally\neverything you typed in 20259 19:22:29,331 --> 19:22:32,061 So there's another verb, POST,\nthat can hide all of that. 20260 19:22:32,062 --> 19:22:33,772 And it's just sent a little differently. 20261 19:22:33,771 --> 19:22:36,501 But things like this are\ntypically sent via GET 20262 19:22:36,501 --> 19:22:39,921 and what that means underneath the\n 20263 19:22:39,922 --> 19:22:43,432 making a request like this, GET /search? 20264 19:22:43,432 --> 19:22:47,752 Q equals, whatever you typed in, the\n 20265 19:22:47,751 --> 19:22:52,822 And hopefully what comes back is a page\n 20266 19:22:52,822 --> 19:22:57,961 And what's interesting here now is, if\n 20267 19:22:57,961 --> 19:23:04,792 and let me go ahead and create a\n 20268 19:23:04,792 --> 19:23:09,682 In Search.html, I'm going to start\n 20269 19:23:11,032 --> 19:23:14,612 And in the body of this page, I'm\ngoing to introduce a form tag. 20270 19:23:14,611 --> 19:23:18,531 And in this form tag, I'm going\nto have a couple of inputs. 
20271 19:23:18,532 --> 19:23:25,252 And the types of inputs are going to\n 20272 19:23:28,448 --> 19:23:30,531 And this isn't that\ninteresting yet, but let's see 20273 19:23:30,532 --> 19:23:32,422 what is happening in the page itself. 20274 19:23:32,422 --> 19:23:34,762 Let me go back to my directory listing. 20275 19:23:34,762 --> 19:23:36,922 Let me click on Search.html. 20276 19:23:36,922 --> 19:23:39,442 I seem to have the beginning\nof my own search engine. 20277 19:23:40,551 --> 19:23:43,201 It's just a text box\nand a submit button. 20278 19:23:43,202 --> 19:23:45,002 But let's finish my thought here. 20279 19:23:45,001 --> 19:23:49,911 So let's specifically give\nthis text box a name of Q 20280 19:23:49,911 --> 19:23:53,751 which, if you roll back to the late '90s\n 20281 19:23:53,751 --> 19:23:58,041 created Google.com, Q represented query,\n 20282 19:23:58,042 --> 19:24:01,822 So the name of this\ntext box shall be text 20283 19:24:01,822 --> 19:24:05,842 shall be Q. The form is\ngoing to use what method? 20284 19:24:05,842 --> 19:24:07,702 Technically it uses GET\nby default, but I'll 20285 19:24:07,702 --> 19:24:10,282 be explicit and say method\nequals quote unquote "get." 20286 19:24:10,282 --> 19:24:14,632 Stupidly, it's lowercase in HTML, even\n 20287 19:24:16,851 --> 19:24:21,491 The action of this form, specifically,\n 20288 19:24:21,491 --> 19:24:24,351 But we don't really have time\ntoday to implement Google itself. 20289 19:24:24,351 --> 19:24:28,941 So we're just going to send the\n 20290 19:24:28,941 --> 19:24:31,152 So I'm creating a form,\nthe action of which 20291 19:24:31,152 --> 19:24:35,711 is to send the data to Google's slash\n 20292 19:24:35,711 --> 19:24:41,021 It's going to send an input called Q,\n 20293 19:24:41,021 --> 19:24:44,081 Let me go back to the\nbrowser, reload the page. 20294 19:24:44,081 --> 19:24:48,911 Nothing seems to have changed yet,\n 20295 19:24:50,351 --> 19:24:53,952 Right now I'm in Search.html. 
20296 19:24:53,952 --> 19:24:57,282 If I zoom out and search for\ncats now and click Submit 20297 19:24:57,282 --> 19:24:59,262 I'm whisked away to google.com. 20298 19:24:59,262 --> 19:25:02,952 But notice that the URL is\nparameterized, with those key value 20299 19:25:04,572 --> 19:25:06,432 And I get back a whole\nbunch of cat results. 20300 19:25:06,432 --> 19:25:08,771 And I can very easily now\nmake this a little prettier. 20301 19:25:08,771 --> 19:25:11,801 Right now, it's not ideal that like\n 20302 19:25:13,116 --> 19:25:15,491 And it's a little obnoxious\nthat autocomplete is enabled. 20303 19:25:15,491 --> 19:25:17,292 If I don't want to\nsearch for cats anymore 20304 19:25:17,292 --> 19:25:21,282 well, according to HTML's documentation,\n 20305 19:25:21,282 --> 19:25:25,362 Autocomplete equals off, to turn\noff autocomplete, auto focus 20306 19:25:25,361 --> 19:25:28,521 to automatically put the\ncursor inside of that text box. 20307 19:25:28,521 --> 19:25:32,292 If I want some explanatory text, I can\n 20308 19:25:33,521 --> 19:25:35,652 And now if I go back to\nthis page and reload 20309 19:25:35,652 --> 19:25:37,461 now it's a little more user-friendly. 20310 19:25:37,461 --> 19:25:39,641 You see query in kind of gray text. 20311 19:25:39,642 --> 19:25:41,472 The cursor is already\nthere and blinking. 20312 19:25:41,471 --> 19:25:43,001 I don't have to even move my cursor. 20313 19:25:43,001 --> 19:25:46,121 I can search for dogs now, and you\n 20314 19:25:46,122 --> 19:25:48,911 Hit enter to submit, and\nnow I'm searching for 20315 19:25:48,911 --> 19:25:51,831 there we go, adorable dogs, instead. 
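The search form described above might look roughly like the sketch below. The attribute names (`action`, `method`, `name`, `autocomplete`, `autofocus`, `placeholder`) come from the narration. The input types, the placeholder text "Query", and the exact action URL are assumptions where the captions are cut off. Attributes are alphabetized, following the habit mentioned earlier.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>search</title>
    </head>
    <body>
        <!-- method="get" puts the form's inputs in the URL as ?q=... -->
        <form action="https://www.google.com/search" method="get">
            <!-- name="q" becomes the key; whatever is typed becomes the value -->
            <input autocomplete="off" autofocus name="q" placeholder="Query" type="text">
            <input type="submit">
        </form>
    </body>
</html>
```

Submitting the form with "dogs" typed in would send the browser to a URL ending in `/search?q=dogs`, the same key-value format typed by hand earlier.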
20316 19:25:52,762 --> 19:25:56,851 I've implemented the front end of\n 20317 19:25:56,851 --> 19:25:58,601 To implement the back\nend, we're obviously 20318 19:25:58,601 --> 19:26:01,812 going to need like a really big\n 20319 19:26:01,812 --> 19:26:05,484 We're going to need some code that like\n 20320 19:26:06,191 --> 19:26:08,091 We're going to need Python\nfor something like that. 20321 19:26:08,092 --> 19:26:09,912 And in fact, that's the direction\nwe're steering next week 20322 19:26:09,911 --> 19:26:11,262 when we implement that back end. 20323 19:26:11,262 --> 19:26:14,592 But today it's all about this front end. 20324 19:26:14,592 --> 19:26:20,682 Or any question, then, about forms,\n 20325 19:26:20,682 --> 19:26:24,680 transition to making things look\na little prettier, with CSS? 20326 19:26:24,679 --> 19:26:27,221 And then we'll end by making\nthings a little more functional 20327 19:26:31,941 --> 19:26:35,121 All right, so let's start to answer\n 20328 19:26:35,122 --> 19:26:40,232 came up, by making these pages a\n 20329 19:26:40,232 --> 19:26:45,872 Let's go ahead now and introduce to\n 20330 19:26:45,872 --> 19:26:48,562 Let me go ahead and create\na file called Home.html 20331 19:26:48,562 --> 19:26:51,292 as though I'm making a home\npage for the very first time. 20332 19:26:51,292 --> 19:26:53,961 And in this page, I'm going\nto give a title of Home. 20333 19:26:53,961 --> 19:26:55,891 And I'm just going to\nhave like three things. 20334 19:26:55,892 --> 19:27:00,292 First I'm going to have maybe\na paragraph of text up here 20335 19:27:00,292 --> 19:27:03,322 at the top, that says something\nwelcoming for my home page 20336 19:27:03,322 --> 19:27:06,741 like my name, John Harvard, for\n 20337 19:27:06,741 --> 19:27:10,072 Then in the middle of the page,\n 20338 19:27:10,072 --> 19:27:12,982 welcome to my home\npage exclamation point! 
20339 19:27:12,982 --> 19:27:16,042 And at the bottom of the page, I'm\n 20340 19:27:16,042 --> 19:27:19,672 says something like copyright,\nthe copyright symbol, John 20341 19:27:19,672 --> 19:27:21,172 Harvard, or something like that. 20342 19:27:21,172 --> 19:27:25,432 All right, so it's like a web page\n 20343 19:27:26,721 --> 19:27:27,981 This isn't that interesting. 20344 19:27:27,982 --> 19:27:32,422 If I open this page called\nHome.html, let me go ahead 20345 19:27:32,422 --> 19:27:35,932 and create three quick paragraphs,\n 20346 19:27:35,932 --> 19:27:39,292 Inside the middle, I'm going to say\n 20347 19:27:40,232 --> 19:27:42,472 And at the bottom,\nwhoops, at the bottom 20348 19:27:42,471 --> 19:27:45,651 a little footer that says\nsomething like copyright 20349 19:27:45,652 --> 19:27:50,331 a little simple copyright\nsymbol, and John Harvard's name. 20350 19:27:50,331 --> 19:27:52,364 All right, now let me reload the page. 20351 19:27:53,032 --> 19:27:57,292 It's a very simple, very underwhelming\n 20352 19:27:57,292 --> 19:28:00,188 Let's start to now stylize\nthis in an interesting way 20353 19:28:00,188 --> 19:28:02,271 so that it's a little more\naesthetically pleasing. 20354 19:28:02,271 --> 19:28:04,251 First, these aren't really paragraphs. 20355 19:28:04,251 --> 19:28:08,104 They're sort of like areas of the page,\n 20356 19:28:08,104 --> 19:28:09,771 There's like the main part of my screen. 20357 19:28:09,771 --> 19:28:11,521 And then there's the\nfooter of my screen. 20358 19:28:11,521 --> 19:28:13,813 So paragraphs isn't quite\nright, if these aren't really 20359 19:28:15,152 --> 19:28:17,722 I might more properly call\nthem divs or divisions 20360 19:28:17,721 --> 19:28:21,831 of the page, which is a very commonly\n 20361 19:28:21,831 --> 19:28:24,121 this generic rectangular region to it. 
20362 19:28:24,122 --> 19:28:28,192 It does not do anything aesthetically,\n 20363 19:28:28,191 --> 19:28:32,601 It just creates an invisible\nrectangular region, inside of which 20364 19:28:32,601 --> 19:28:34,342 you can start to style the text. 20365 19:28:34,342 --> 19:28:36,262 Or I can take this one step further. 20366 19:28:36,262 --> 19:28:40,461 There's some other tags in HTML,\n 20367 19:28:40,461 --> 19:28:42,869 have names that describe the\ntypes of your page, which 20368 19:28:42,869 --> 19:28:45,411 is all the more compelling these\ndays for accessibility, too 20369 19:28:45,411 --> 19:28:49,881 for screen readers, for search engines,\n 20370 19:28:49,881 --> 19:28:52,584 engine can realize that footer\nis probably a little fluffy. 20371 19:28:52,584 --> 19:28:54,292 The header might be\na little interesting. 20372 19:28:54,292 --> 19:28:57,202 The main part of the page\nis probably the juicy part 20373 19:28:57,202 --> 19:29:01,762 that I want users to be able to search\n 20374 19:29:01,762 --> 19:29:04,672 So let's start to stylize\nthis page somehow. 20375 19:29:04,672 --> 19:29:08,482 Let's introduce a style\nattribute in HTML 20376 19:29:08,482 --> 19:29:13,101 inside of which is going to be\ntext like this, font size colon 20377 19:29:13,101 --> 19:29:16,851 large, text align colon center. 20378 19:29:16,851 --> 19:29:20,182 On Main, I'm going to add a\nstyle attribute and say font size 20379 19:29:24,021 --> 19:29:27,741 And then on the footer, I'm going\nto say style equals font size 20380 19:29:34,232 --> 19:29:36,592 Well, in blue is the\nlanguage we promised 20381 19:29:36,592 --> 19:29:39,112 called CSS, for Cascading Style Sheets. 20382 19:29:39,111 --> 19:29:42,231 We're not really seeing the\nCascading Style Sheet of it yet. 20383 19:29:42,232 --> 19:29:46,192 But in blue here, notice is\nanother very common paradigm. 
20384 19:29:46,191 --> 19:29:48,861 It's different syntax\nnow, but how would you 20385 19:29:48,861 --> 19:29:52,461 describe what you're\nlooking at here in blue? 20386 19:29:52,461 --> 19:29:56,902 This is another example of what\nkind of programming convention? 20387 19:29:57,789 --> 19:30:00,081 SPEAKER 1: Yeah, it's just\nmore key value pairs, right? 20388 19:30:00,081 --> 19:30:03,031 It'd be nice if the world standardized\n 20389 19:30:03,032 --> 19:30:07,260 because we've now seen equal signs\n 20390 19:30:07,801 --> 19:30:10,009 But it's just different\nlanguages, different choices. 20391 19:30:10,009 --> 19:30:13,121 The key here is font-size,\nthe value is large. 20392 19:30:13,122 --> 19:30:16,922 The other key is text-align,\nthe colon, the value is center. 20393 19:30:16,922 --> 19:30:20,642 The semicolon just separates\none key value pair from another. 20394 19:30:20,642 --> 19:30:24,482 Just like in the URL, the ampersand\ndid, in the context of HTTP. 20395 19:30:24,482 --> 19:30:27,392 The designers of CSS\nused semicolons instead. 20396 19:30:27,392 --> 19:30:30,149 Strictly speaking, this\nsemicolon isn't necessary. 20397 19:30:30,149 --> 19:30:32,732 I tend to include it just for\nsymmetry, but it doesn't matter 20398 19:30:32,732 --> 19:30:34,322 because there's nothing after that. 20399 19:30:34,322 --> 19:30:36,301 This is a bit of a weird example. 20400 19:30:36,301 --> 19:30:41,341 This is the co-mingling of\nCSS inside of HTML. 20401 19:30:41,342 --> 19:30:46,682 So as of now, you can use the CSS\n 20402 19:30:46,682 --> 19:30:49,441 in the value of a style attribute. 20403 19:30:49,441 --> 19:30:52,542 We did something a little\nsimilar last two weeks 20404 19:30:52,542 --> 19:30:57,032 a week plus ago, when we included\nsome SQL inside of Python. 20405 19:30:57,032 --> 19:30:59,764 So again, languages can kind\nof cross barriers together. 
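A sketch of the page at this point, with CSS key-value pairs written inside `style` attributes. The header's `font-size: large` and `text-align: center` come from the narration. The `medium` and `small` sizes for main and footer are assumptions drawn from values mentioned later, since the captions are cut off here. The text content is the John Harvard placeholder from the example.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <body>
        <!-- Each style value is key: value pairs separated by semicolons -->
        <header style="font-size: large; text-align: center;">
            John Harvard
        </header>
        <main style="font-size: medium; text-align: center;">
            Welcome to my home page!
        </main>
        <footer style="font-size: small; text-align: center;">
            Copyright John Harvard
        </footer>
    </body>
</html>
```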
20406 19:30:59,764 --> 19:31:01,682 But we're going to clean\nthis up, because this 20407 19:31:01,682 --> 19:31:04,599 is going to get messy quickly,\n 20408 19:31:04,599 --> 19:31:07,211 of Harvard's or Yale's, or the like. 20409 19:31:07,211 --> 19:31:09,372 So let's see what this looks like. 20410 19:31:09,372 --> 19:31:13,142 Let me go back to my browser\nwindow here, reload the page. 20411 19:31:13,142 --> 19:31:14,922 And it's not that different. 20412 19:31:14,922 --> 19:31:19,412 But it's indeed centered, and it's\n 20413 19:31:19,411 --> 19:31:20,732 And let me make one refinement. 20414 19:31:20,732 --> 19:31:22,771 The copyright symbol\nactually can be expressed 20415 19:31:22,771 --> 19:31:25,021 but there's no key on\nmy US keyboard here. 20416 19:31:25,021 --> 19:31:30,902 I can actually magically say\nampersand hash 169 semicolon 20417 19:31:30,902 --> 19:31:33,167 using what's called an HTML entity. 20418 19:31:33,167 --> 19:31:36,991 It turns out there are numeric\n 20419 19:31:36,991 --> 19:31:40,351 allow you to specify symbols that\n 20420 19:31:40,351 --> 19:31:42,361 but that don't exist on most keyboards. 20421 19:31:42,361 --> 19:31:46,051 If I reload the page now, now\nit's a proper copyright symbol. 20422 19:31:46,051 --> 19:31:50,221 So minor aesthetic, but it\nintroduces us to these HTML entities. 20423 19:31:50,221 --> 19:31:53,671 So even if you've never\nseen CSS before, you 20424 19:31:53,672 --> 19:31:56,522 can probably find something\nkind of dumb about what 20425 19:31:56,521 --> 19:31:58,201 I did here, like poor design. 20426 19:31:58,202 --> 19:32:02,582 It is correct, if my goal was small,\n 20427 19:32:02,581 --> 19:32:07,081 looks like a bad design,\nperhaps, even if you've never 20428 19:32:09,872 --> 19:32:12,122 SPEAKER 1: Yeah, I've used\nthe same style three times 20429 19:32:12,122 --> 19:32:14,942 like copy/paste, or typing the\nexact same thing again and again. 
20430 19:32:14,941 --> 19:32:17,221 It has rarely been a good thing. 20431 19:32:17,221 --> 19:32:21,691 Well, here's where we can take\nadvantage of the design of CSS 20432 19:32:21,691 --> 19:32:24,121 because it supports what\nwe might call inheritance 20433 19:32:24,122 --> 19:32:29,461 whereby children inherit the properties,\n 20434 19:32:30,572 --> 19:32:32,471 And what that means is, I can do this. 20435 19:32:32,471 --> 19:32:34,231 Let me get rid of this text align. 20436 19:32:34,232 --> 19:32:36,032 Let me get rid of this text align. 20437 19:32:37,232 --> 19:32:40,142 I could get rid of the semicolon,\n 20438 19:32:40,142 --> 19:32:46,502 And let me add all of that style\n 20439 19:32:46,501 --> 19:32:51,511 so that it sort of cascades down to the\n 20440 19:32:52,331 --> 19:32:54,451 And let me close my quotes there, too. 20441 19:32:54,452 --> 19:32:58,381 Now, if I go back to my browser\nand hit reload, nothing changes. 20442 19:32:58,381 --> 19:33:00,241 But it's a little\nbetter designed, right? 20443 19:33:00,241 --> 19:33:03,601 Because if I want to change the text\n 20444 19:33:03,601 --> 19:33:06,491 I can now reload the page, and\nvoila, now it's over there. 20445 19:33:06,491 --> 19:33:08,952 I change it in one place, not\nin three different places. 20446 19:33:08,952 --> 19:33:11,822 So that would seem to be\nmarginally better design. 20447 19:33:11,822 --> 19:33:14,441 And could we do this\nany more differently? 20448 19:33:14,441 --> 19:33:20,251 Well, it's not that elegant that\n 20449 19:33:20,251 --> 19:33:22,621 This generally tends to\nbe bad practice, where 20450 19:33:22,622 --> 19:33:26,012 you co-mingle your HTML and your\n 20451 19:33:26,012 --> 19:33:28,622 might be really good at laying\nout the structure of web pages 20452 19:33:28,622 --> 19:33:31,411 and the content and the data, and you\n 20453 19:33:31,411 --> 19:33:32,822 or just not care about the aesthetics. 
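The refinement just described could be sketched as follows: `text-align` moves up to the parent `body` tag so the three children inherit it, and the copyright symbol is written with the numeric entity `&#169;`. The font sizes on main and footer remain assumptions, as before.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>home</title>
    </head>
    <!-- text-align is set once on the parent and inherited by the children -->
    <body style="text-align: center;">
        <header style="font-size: large;">
            John Harvard
        </header>
        <main style="font-size: medium;">
            Welcome to my home page!
        </main>
        <footer style="font-size: small;">
            <!-- &#169; is the HTML entity for the copyright symbol -->
            Copyright &#169; John Harvard
        </footer>
    </body>
</html>
```

Changing `center` to `right` in that one attribute would realign all three regions at once.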
20454 19:33:32,822 --> 19:33:34,572 You might work with a\ndesigner, an artist 20455 19:33:34,572 --> 19:33:37,741 who's much better at all of\nthese fine tunings aesthetically. 20456 19:33:37,741 --> 19:33:41,921 Wouldn't it be nice if you could work\n 20457 19:33:41,922 --> 19:33:43,982 And you don't have to\nsomehow like literally 20458 19:33:43,982 --> 19:33:46,112 edit the same lines\nof code as each other. 20459 19:33:46,111 --> 19:33:50,131 Well, just like we can move\nstuff into header files in C 20460 19:33:50,131 --> 19:33:53,895 or packages in Python, we\ncan do the same in CSS. 20461 19:33:53,895 --> 19:33:55,812 So I'm actually going\nto go ahead and do this. 20462 19:33:55,812 --> 19:33:58,382 Let me get rid of all of\nthese style attributes 20463 19:33:58,381 --> 19:34:03,842 and let me now start to practice a\n 20464 19:34:05,221 --> 19:34:10,051 Let me instead move it into the\n 20465 19:34:11,611 --> 19:34:14,011 This is one of the rare\nexamples where there 20466 19:34:14,012 --> 19:34:16,862 are attributes that have the\nsame names as tags, and vice versa. 20467 19:34:16,861 --> 19:34:19,441 It's not very common,\nbut this one does exist. 20468 19:34:19,441 --> 19:34:22,891 Here's a slightly different syntax for\n 20469 19:34:22,892 --> 19:34:26,912 If I want to apply CSS properties,\nthat is, key value pairs 20470 19:34:26,911 --> 19:34:31,201 to the header of the page, I say\n 20471 19:34:31,202 --> 19:34:38,101 and inside of those I say\nfont-size large, text-align center. 20472 19:34:38,101 --> 19:34:42,131 Then, if I want to apply some properties\n 20473 19:34:42,131 --> 19:34:47,072 I again do font-size, say, medium,\n 20474 19:34:47,072 --> 19:34:49,622 Then, lastly, on the\nfooter of the page, I 20475 19:34:49,622 --> 19:34:55,021 can assign some properties like\n 20476 19:34:57,532 --> 19:35:00,442 And I don't have to do\nanything more in my HTML. 20477 19:35:00,441 --> 19:35:03,871 It all just represents\nthe structure of my page. 
20478 19:35:03,872 --> 19:35:06,313 But, because of this style\ntag in the head of the page 20479 19:35:06,313 --> 19:35:08,271 the browser knows in\nadvance that the moment it 20480 19:35:08,271 --> 19:35:10,851 encounters a header tag, a\nmain tag, or a footer tag 20481 19:35:10,851 --> 19:35:14,062 it should apply those\nproperties, those styles. 20482 19:35:14,062 --> 19:35:17,032 If I reload the page, other\nthan it being recentered now 20483 19:35:18,111 --> 19:35:21,171 All we're doing is sort of\n 20484 19:35:21,172 --> 19:35:24,322 But now everything's\nin the top of the file. 20485 19:35:24,322 --> 19:35:26,241 But there's still a bad design here. 20486 19:35:26,241 --> 19:35:30,202 What could I now do\nthat would be smarter? 20487 19:35:33,952 --> 19:35:35,932 SPEAKER 1: OK, create a\nnew file with just the CSS. 20488 19:35:36,411 --> 19:35:37,828 Let's go there in just one second. 20489 19:35:37,828 --> 19:35:40,161 But even as we're here,\nthere's still a redundancy 20490 19:35:40,161 --> 19:35:41,902 we can probably chip away at. 20491 19:35:41,902 --> 19:35:45,232 Yeah, get rid of the text-align center\n 20492 19:35:45,232 --> 19:35:47,601 doesn't seem necessary,\nand perhaps someone 20493 19:35:47,601 --> 19:35:53,991 else, if I get rid of text-align center,\n 20494 19:35:53,991 --> 19:35:56,991 in order to bring it back, but\n 20495 19:35:56,991 --> 19:35:59,822 And the page, if I scroll\ndown, looks like this, in HTML. 20496 19:36:01,282 --> 19:36:02,792 SPEAKER 1: Yeah, so the body tag. 20497 19:36:02,792 --> 19:36:04,711 So let me go ahead and say body. 20498 19:36:04,711 --> 19:36:07,282 And then in here, put text-align center. 20499 19:36:07,282 --> 19:36:10,282 And that, now, if I reload the\npage, has no visual effect 20500 19:36:10,282 --> 19:36:12,142 but it's just better\ndesign, because now I 20501 19:36:12,142 --> 19:36:14,242 factored out that kind of commonality. 
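A sketch of the same page with the CSS moved into a `style` tag in the head, using type selectors (selectors that are simply tag names), and with the shared `text-align: center` factored out to `body`. The `small` size for the footer is an assumption where the captions are cut off.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <style>
            /* Factored out: inherited by header, main, and footer */
            body
            {
                text-align: center;
            }

            header
            {
                font-size: large;
            }

            main
            {
                font-size: medium;
            }

            footer
            {
                font-size: small;
            }
        </style>
        <title>home</title>
    </head>
    <body>
        <header>
            John Harvard
        </header>
        <main>
            Welcome to my home page!
        </main>
        <footer>
            Copyright &#169; John Harvard
        </footer>
    </body>
</html>
```

The body of the page now carries only structure; every style decision lives at the top of the file.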
20502 19:36:14,241 --> 19:36:16,521 And so, just to make clear\nwhat we've been doing here 20503 19:36:16,521 --> 19:36:19,491 these are all, again, CSS\nproperties, these key value pairs. 20504 19:36:19,491 --> 19:36:22,461 And there's different types\nof ways of using them. 20505 19:36:22,461 --> 19:36:24,797 And there's this whole taxonomy. 20506 19:36:24,797 --> 19:36:27,922 What we've been doing thus far are what\n 20507 19:36:27,922 --> 19:36:30,172 where the type is the name of a tag. 20508 19:36:30,172 --> 19:36:33,172 And so it turns out there's\nother ways, though, to do this. 20509 19:36:33,172 --> 19:36:35,211 And let's head in this direction. 20510 19:36:35,211 --> 19:36:38,782 Let's go ahead and maybe write\nour CSS slightly differently 20511 19:36:38,782 --> 19:36:40,282 because you know what would be nice. 20512 19:36:40,282 --> 19:36:44,512 I bet, after today, once I start\n 20513 19:36:44,512 --> 19:36:46,222 or John Harvard's home\npage, I might want 20514 19:36:46,221 --> 19:36:48,741 to have centered text on other pages. 20515 19:36:48,741 --> 19:36:52,012 And I might want to have large\n 20516 19:36:52,012 --> 19:36:55,012 It'd be nice if I could reuse\nthese properties again and again 20517 19:36:55,012 --> 19:36:57,292 and kind of create my\nown library, maybe even 20518 19:36:57,292 --> 19:36:59,812 ultimately putting it\nin a separate file. 20519 19:37:00,562 --> 19:37:04,042 Instead of explicitly applying\ntext-align center to the body 20520 19:37:04,042 --> 19:37:07,101 let me create a new\nnoun, or an adjective 20521 19:37:07,101 --> 19:37:09,441 rather, for myself, called centered. 20522 19:37:09,441 --> 19:37:12,021 It has to start with a\ndot, because what I'm doing 20523 19:37:12,021 --> 19:37:14,781 is inventing my own class, so to speak. 20524 19:37:14,782 --> 19:37:17,662 This has nothing to do with\nclasses in Java or Python. 20525 19:37:17,661 --> 19:37:20,271 Class here is this aesthetic feature. 
20526 19:37:20,271 --> 19:37:22,971 And, actually, let me rename\nthese, to be dot large 20527 19:37:26,001 --> 19:37:29,931 What this is doing for me\nis it's inventing new words 20528 19:37:29,932 --> 19:37:33,232 well-named words, that I\ncan now use in this file 20529 19:37:33,232 --> 19:37:36,211 or potentially in other web\npages I make, as follows. 20530 19:37:36,211 --> 19:37:39,322 I can now say, if I want\nto center the whole body 20531 19:37:39,322 --> 19:37:41,631 I can say class equals centered. 20532 19:37:41,631 --> 19:37:45,171 On the header tag, I can\nsay class equals large. 20533 19:37:45,172 --> 19:37:48,222 On the main tag I can\nsay class equals medium. 20534 19:37:48,221 --> 19:37:50,631 On the footer tag, I can\nsay class equals small. 20535 19:37:50,631 --> 19:37:53,391 But let me take this one step further. 20536 19:37:53,392 --> 19:37:56,212 As you suggested, why\ndon't I go ahead now 20537 19:37:56,211 --> 19:37:59,452 and let me actually get rid\nof-- let me grab all of the CSS 20538 19:38:01,312 --> 19:38:08,362 Let me get rid of the style tag here,\n 20539 19:38:08,361 --> 19:38:13,432 and let me just save all of that same\n 20540 19:38:13,432 --> 19:38:15,622 nothing else, no HTML whatsoever. 20541 19:38:15,622 --> 19:38:18,801 But let me go back to my\nHome.html page, and this 20542 19:38:18,801 --> 19:38:21,951 is one of the most annoyingly named\n 20543 19:38:21,952 --> 19:38:30,052 mean what it does, Link HREF\nHome.css rel equals stylesheet. 20544 19:38:30,051 --> 19:38:33,292 So ideally we would have used the\n 20545 19:38:33,292 --> 19:38:36,112 but this is link in the\nsort of conceptual sense. 20546 19:38:36,111 --> 19:38:39,981 We're linking this file to this other\n 20547 19:38:39,982 --> 19:38:43,612 using this hyper-reference,\nHome.css, the relationship 20548 19:38:43,611 --> 19:38:46,281 of that file to this one\nis that of stylesheet. 
20549 19:38:46,282 --> 19:38:48,472 A stylesheet is a file\ncontaining a whole bunch 20550 19:38:48,471 --> 19:38:52,581 of stylizations, a whole bunch\nof properties, as we just did. 20551 19:38:52,581 --> 19:38:54,621 So here, too, it's\nunderwhelming the effect. 20552 19:38:54,622 --> 19:38:57,002 If I reload the page, nothing changed. 20553 19:38:57,001 --> 19:39:01,881 But now, I not only have\na better design here 20554 19:39:01,881 --> 19:39:06,921 because I can now use those same classes\n 20555 19:39:06,922 --> 19:39:11,211 my third page, my fourth page, my bio,\n 20556 19:39:11,211 --> 19:39:15,021 making on my website here, I\ncan reuse those styles by just 20557 19:39:15,021 --> 19:39:19,251 including one line of code, instead of\n 20558 19:39:19,251 --> 19:39:21,841 stuff into file after file after file. 20559 19:39:21,842 --> 19:39:24,322 And heck, if the rest\nof the world is really 20560 19:39:24,322 --> 19:39:28,197 impressed by my centered class, and\n 20561 19:39:28,197 --> 19:39:31,072 I could bundle this up, let other\n 20562 19:39:31,072 --> 19:39:35,482 and I have my own library, my own CSS\n 20563 19:39:35,482 --> 19:39:37,881 Why should you ever invent\na centered class again 20564 19:39:37,881 --> 19:39:41,331 if I already did it for you,\nstupid and small as this one is. 20565 19:39:41,331 --> 19:39:43,191 But it would be nice\nnow to package this up 20566 19:39:43,191 --> 19:39:47,911 in a way that's usable\nby other people as well. 20567 19:39:47,911 --> 19:39:51,351 So this is perhaps the best\ndesign, when it comes to CSS. 20568 19:39:51,351 --> 19:39:56,601 Use classes where you can, use\n 20569 19:39:56,601 --> 19:40:00,971 but don't use the style attribute\n 20570 19:40:00,971 --> 19:40:06,531 starts to get messy quickly,\nespecially for large files. 20571 19:40:06,532 --> 19:40:08,617 All right, any questions, then, on this. 20572 19:40:11,361 --> 19:40:13,731 No, all right, so\nthat's class selectors. 
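A sketch of the final arrangement: the page links to an external stylesheet and opts into styles with `class` attributes. The class names `centered`, `large`, `medium`, and `small` come from the narration. The lowercase file name `home.css` is an assumption about capitalization, and the stylesheet's contents are shown in a comment only so this one block stays self-contained.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <!-- home.css (a separate file containing only CSS) would hold:

             .centered { text-align: center; }
             .large    { font-size: large; }
             .medium   { font-size: medium; }
             .small    { font-size: small; }
        -->
        <link href="home.css" rel="stylesheet">
        <title>home</title>
    </head>
    <body class="centered">
        <header class="large">
            John Harvard
        </header>
        <main class="medium">
            Welcome to my home page!
        </main>
        <footer class="small">
            Copyright &#169; John Harvard
        </footer>
    </body>
</html>
```

Any other page on the site could reuse the same classes by including the same single `link` line.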
20573 19:40:13,732 --> 19:40:16,131 When you specify dot\nsomething, that means 20574 19:40:16,131 --> 19:40:20,211 you're selecting all of the tags in the\n 20575 19:40:20,211 --> 19:40:21,661 and applying those properties. 20576 19:40:21,661 --> 19:40:23,828 So there's a couple of\nothers here, just to give you 20577 19:40:23,828 --> 19:40:25,432 a taste now of what's possible. 20578 19:40:25,432 --> 19:40:29,072 There's so much more that you can\n 20579 19:40:29,072 --> 19:40:33,501 Let me go ahead and open up a few\n 20580 19:40:33,501 --> 19:40:35,391 Let me go ahead and open up VS Code. 20581 19:40:35,392 --> 19:40:43,502 And let me go ahead and copy\nmy source eight directory. 20582 19:40:43,501 --> 19:40:47,041 Give me one second to grab the source\n 20583 19:40:47,042 --> 19:40:51,872 so that I can now go into\nmy browser, go into some 20584 19:40:51,872 --> 19:40:53,822 of the pre-made examples\nin source eight 20585 19:40:53,822 --> 19:40:57,461 and let me open up paragraphs one here. 20586 19:40:57,461 --> 19:41:00,691 So here's something,\nit's a little subtle. 20587 19:41:00,691 --> 19:41:04,292 But does anyone notice\nhow this is stylized? 20588 19:41:04,292 --> 19:41:07,142 This is just some generic\nlorem ipsum text again. 20589 19:41:07,142 --> 19:41:12,972 But what's noteworthy\nstylistically, a book might do this. 20590 19:41:14,432 --> 19:41:15,932 SPEAKER 1: Yeah, the first\nparagraph's a little bigger. 20591 19:41:16,322 --> 19:41:18,691 Who knows, it's just a stylistic\n 20592 19:41:18,691 --> 19:41:19,981 The first paragraph is bigger. 20593 19:41:21,072 --> 19:41:23,562 Well, we can actually explore\nthis in a couple of ways. 20594 19:41:23,562 --> 19:41:26,101 One, I can obviously go into\nVS Code and show you the code. 
20595 19:41:26,101 --> 19:41:29,281 But, now, that we're using Chrome and\n 20596 19:41:30,572 --> 19:41:33,661 View developer, developer\ntools, and now notice 20597 19:41:33,661 --> 19:41:37,201 let me turn off the mobile feature,\n 20598 19:41:37,202 --> 19:41:39,752 to the bottom, just so\nthat it's fully wide. 20599 19:41:39,751 --> 19:41:41,641 We looked at the Network tab before. 20600 19:41:41,642 --> 19:41:44,072 We looked at the mobile button before. 20601 19:41:44,072 --> 19:41:45,971 Now let me click on Elements. 20602 19:41:45,971 --> 19:41:49,651 What's nice about the Elements tab\n 20603 19:41:49,652 --> 19:41:54,752 version of the web page's HTML,\n 20604 19:41:54,751 --> 19:41:58,831 for you, so that you can now henceforth\n 20605 19:41:58,831 --> 19:42:02,461 code, the HTML source code, of\nany web page on the internet. 20606 19:42:02,461 --> 19:42:05,221 Notice that my own web page\nhere, it's not that interesting. 20607 19:42:05,221 --> 19:42:08,072 There's a bunch of paragraph\ntags of lorem ipsum text. 20608 19:42:09,482 --> 19:42:12,452 The very first one, I gave an ID to. 20609 19:42:12,452 --> 19:42:14,851 This is something that you,\nas a web designer, can do. 20610 19:42:14,851 --> 19:42:18,452 You can give an ID attribute\nto any tag in a page 20611 19:42:18,452 --> 19:42:20,222 to give it a unique identifier. 20612 19:42:20,221 --> 19:42:22,951 The onus is on you, not to\nreuse the word, anywhere else. 20613 19:42:22,952 --> 19:42:24,932 If you reuse it, you've screwed up. 20614 19:42:26,402 --> 19:42:30,362 But I chose an ID of\nfirst, just so that I 20615 19:42:30,361 --> 19:42:34,271 have some way of referring to the\n 20616 19:42:34,271 --> 19:42:37,572 If I look in the head of the\npage, and the style tag here 20617 19:42:37,572 --> 19:42:40,292 notice that I have hash first. 
20618 19:42:40,292 --> 19:42:43,411 So just as I use dot for\nclasses, the world of CSS 20619 19:42:43,411 --> 19:42:46,741 uses a hash symbol to\nrepresent IDs, unique IDs. 20620 19:42:46,741 --> 19:42:51,721 And what this is telling the browser,\n 20621 19:42:51,721 --> 19:42:57,182 F-I-R-S-T, without the hash,\napply font-size larger to it. 20622 19:42:57,182 --> 19:43:00,542 And that's why the first paragraph,\n 20623 19:43:01,952 --> 19:43:04,502 If I actually go into\nVS Code now, and let 20624 19:43:04,501 --> 19:43:06,121 me go into my source eight directory. 20625 19:43:06,122 --> 19:43:09,002 Let me open up Paragraphs1.html. 20626 19:43:10,622 --> 19:43:14,521 If I want to change the color of that\n 20627 19:43:14,521 --> 19:43:16,351 I can do color colon: green. 20628 19:43:16,351 --> 19:43:19,861 Let me close the developer\ntools, reload the page. 20629 19:43:19,861 --> 19:43:22,572 And now that page is green as well. 20630 19:43:22,572 --> 19:43:24,152 You don't have to just use words. 20631 19:43:27,271 --> 19:43:31,081 What was the hex code for green in RGB? 20632 19:43:31,081 --> 19:43:34,691 Like no red, lots of green, no blue. 20633 19:43:34,691 --> 19:43:38,881 So you could do 00 FF 00, using\na hash, which, coincidentally 20634 19:43:38,881 --> 19:43:41,131 is the same symbol, but it\nhas nothing to do with IDs. 20635 19:43:41,131 --> 19:43:44,792 This is just how Photoshop and\nweb pages represent colors. 20636 19:43:44,792 --> 19:43:46,050 Let's go back here and reload. 20637 19:43:46,050 --> 19:43:48,842 It's the same, although it's a\n 20638 19:43:50,251 --> 19:43:56,131 If I want to change it to red, that\n 20639 19:43:56,131 --> 19:43:58,081 and here I can go and reload. 20640 19:43:58,081 --> 19:43:59,932 Now it's first paragraph red. 20641 19:43:59,932 --> 19:44:01,682 This actually gets\npretty tedious quickly. 
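A minimal sketch of the kind of page being described here, assuming a file along the lines of paragraphs1.html; the title and paragraph text are placeholders, not the course's actual file. It shows the hash used in two unrelated ways: once as an ID selector, once to start a hex color.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>paragraphs</title>
        <style>
            /* ID selector: matches the single element whose id attribute is "first" */
            #first {
                font-size: larger;
                /* Hex color: 00 red, ff green, 00 blue. This hash has nothing to do with IDs. */
                color: #00ff00;
            }
        </style>
    </head>
    <body>
        <p id="first">Lorem ipsum dolor sit amet (placeholder text).</p>
        <p>A second paragraph, left at the browser's default size and color.</p>
    </body>
</html>
```

Changing `#00ff00` to `#ff0000` would turn the first paragraph red, and the same property can be toggled live in the Styles pane of the developer tools without touching the file.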
20642 19:44:01,682 --> 19:44:04,622 Like, if you're a web designer trying\n 20643 19:44:04,622 --> 19:44:06,961 it actually might be fun\nto tinker with the website 20644 19:44:06,961 --> 19:44:09,422 before you open up your editor\nand you start making changes 20645 19:44:11,771 --> 19:44:14,611 So notice what you can\ndo with developer tools 20646 19:44:14,611 --> 19:44:16,591 too, in Chrome and other browsers. 20647 19:44:16,592 --> 19:44:19,982 When I highlight over this\nparagraph, under the Elements tab 20648 19:44:19,982 --> 19:44:22,502 notice that, one, it\ngets highlighted in blue. 20649 19:44:22,501 --> 19:44:24,714 If I move my cursor, it\ndoesn't get highlighted. 20650 19:44:24,714 --> 19:44:26,131 If I move it, it gets highlighted. 20651 19:44:26,131 --> 19:44:29,641 So it's showing me what\nthat tag represents. 20652 19:44:29,642 --> 19:44:32,342 But notice over here on\nthe right, you can also 20653 19:44:32,342 --> 19:44:35,938 see all of the stylizations\nof that particular element. 20654 19:44:37,021 --> 19:44:40,561 The italicized ones here at the\n 20655 19:44:40,562 --> 19:44:44,772 That means this is what Google makes\n 20656 19:44:44,771 --> 19:44:48,432 But in non-italicized\nhere, you see hash first 20657 19:44:48,432 --> 19:44:50,441 which is my code, that I just changed. 20658 19:44:50,441 --> 19:44:55,981 And if I want to start tinkering with\n 20659 19:44:57,211 --> 19:45:02,101 But notice, if I go back to VS Code, I\n 20660 19:45:02,101 --> 19:45:04,141 This is now purely client side. 20661 19:45:05,221 --> 19:45:08,281 When I drew that picture\nearlier of the browser going 20662 19:45:08,282 --> 19:45:11,252 making a request to the cloud, the\n 20663 19:45:11,251 --> 19:45:14,491 coming back, the browser,\nyour Mac, your PC, your phone 20664 19:45:14,491 --> 19:45:18,641 has a copy of all the HTML and\nCSS, so you can change it here 20665 19:45:20,664 --> 19:45:22,831 And, for instance, you can\ndo this with any website. 
20666 19:45:22,831 --> 19:45:29,011 Let's go, say, on a field trip\nhere, to how about Stanford.edu. 20667 19:45:29,012 --> 19:45:31,771 So here's Stanford's\nwebsite as of today. 20668 19:45:31,771 --> 19:45:34,141 Let's go ahead here\nand let's see, there's 20669 19:45:34,142 --> 19:45:36,642 their admissions page,\ncampus life, and so forth. 20670 19:45:36,642 --> 19:45:41,282 Let me go ahead and view developer\ntools on Stanford's page 20671 19:45:41,282 --> 19:45:45,242 developer tools, elements,\nyou can see all of their HTML. 20672 19:45:45,241 --> 19:45:48,221 And notice it's collapsed,\nso here is their header. 20673 19:45:48,221 --> 19:45:51,121 Here's their main part, and\nI'm using my keyboard shortcuts 20674 19:45:51,122 --> 19:45:54,203 to just open and close the tags,\nto dive in deeper and deeper. 20675 19:45:54,203 --> 19:45:56,161 Suppose you want to kind\nof mess with Stanford 20676 19:45:56,161 --> 19:45:58,861 you can actually like right\nclick on any element of a page 20677 19:45:58,861 --> 19:46:03,121 or control click, Inspect, and that's\n 20678 19:46:03,122 --> 19:46:07,062 to the tag in the Elements\ntab that shows you that link. 20679 19:46:07,062 --> 19:46:11,882 And notice, if I hover over this\n 20680 19:46:11,881 --> 19:46:13,921 as an unordered list from left to right. 20681 19:46:13,922 --> 19:46:16,255 But it doesn't have to be a\nbulleted list top to bottom. 20682 19:46:16,255 --> 19:46:20,202 They've used CSS to change it to be\n 20683 19:46:20,202 --> 19:46:22,862 research, health care,\ncampus admission, about. 20684 19:46:22,861 --> 19:46:25,691 Well, so much for\nadmission, that's gone. 20685 19:46:25,691 --> 19:46:29,911 So now, if I close developer tools,\n 20686 19:46:29,911 --> 19:46:32,971 But, of course, what have I really done. 20687 19:46:32,971 --> 19:46:35,404 I've just like mutated\nmy own local copy. 
20688 19:46:35,404 --> 19:46:37,322 So this is not hacking,\neven though this might 20689 19:46:37,322 --> 19:46:38,947 be how they do it in TV and the movies. 20690 19:46:38,947 --> 19:46:40,691 It's still there if I reload the page. 20691 19:46:40,691 --> 19:46:44,601 But it's a wonderfully powerful way\n 20692 19:46:44,601 --> 19:46:46,351 different things\nstylistically, figure out 20693 19:46:46,351 --> 19:46:48,902 how you want to design\nsomething, and two, just learn 20694 19:46:50,381 --> 19:46:53,339 So, for instance, if I right click\n 20695 19:46:53,339 --> 19:46:56,831 go to inspect, and let\nme go to the LI tag. 20696 19:46:56,831 --> 19:47:00,211 Let me keep going up, up,\nup, up, up to the UL tag. 20697 19:47:00,211 --> 19:47:02,322 There's going to be a lot going on here. 20698 19:47:02,322 --> 19:47:06,271 But notice, they have applied\nall of these CSS properties 20699 19:47:08,702 --> 19:47:11,912 But notice, here, this is\nhow, it's something like this. 20700 19:47:11,911 --> 19:47:15,872 And we'd have to read more to learn\n 20701 19:47:15,872 --> 19:47:18,122 this is how they probably\ngot rid of the bullets. 20702 19:47:18,122 --> 19:47:19,682 And what you can do is just tinker. 20703 19:47:19,682 --> 19:47:21,389 Like, all right, well,\nwhat does this do? 20704 19:47:22,771 --> 19:47:26,101 All right, didn't really change\n 20705 19:47:26,922 --> 19:47:30,572 So now the margin is changed, the\npadding around it has changed. 20706 19:47:31,801 --> 19:47:34,081 We can just start turning\nthings on and off, just 20707 19:47:34,081 --> 19:47:35,792 to get a sense of how\nthe web page works. 20708 19:47:35,792 --> 19:47:37,831 I'm not really learning\nanything here so far. 
20709 19:47:37,831 --> 19:47:43,801 Let me go to the LI here for, let's\n 20710 19:47:47,432 --> 19:47:50,229 So when there's a\ndisplay property in CSS 20711 19:47:50,229 --> 19:47:53,312 that's apparently effectively changing\n 20712 19:47:53,312 --> 19:47:56,491 if I turn that off, now Stanford's\nlinks all look like this. 20713 19:47:56,491 --> 19:47:57,721 And there are those bullets. 20714 19:47:57,721 --> 19:48:00,841 So again, just default styles,\nthat they've somehow overridden 20715 19:48:00,842 --> 19:48:03,782 and a good web designer\njust knows ultimately 20716 19:48:03,782 --> 19:48:06,500 how to do these kinds of things. 20717 19:48:06,500 --> 19:48:08,792 All right, how about a couple\nof final building blocks 20718 19:48:08,792 --> 19:48:09,991 before we'll take one more break. 20719 19:48:09,991 --> 19:48:12,616 And then we'll dive in with\nJavaScript to manipulate this stuff 20720 19:48:13,771 --> 19:48:17,432 Let me go ahead and open up,\nhow about Paragraphs2 here. 20721 19:48:17,432 --> 19:48:21,631 Let me close this tab, let me go\n 20722 19:48:21,631 --> 19:48:25,051 And this one looks\nthe same, except, when 20723 19:48:25,051 --> 19:48:27,661 I go ahead and inspect\nthis first paragraph 20724 19:48:27,661 --> 19:48:29,822 notice that I was able\nto get rid of the ID 20725 19:48:29,822 --> 19:48:32,202 somehow, which is just\nto say, there's many 20726 19:48:32,202 --> 19:48:34,562 many ways to solve\nproblems in HTML and CSS 20727 19:48:34,562 --> 19:48:36,422 just like there is in C and Python. 20728 19:48:36,422 --> 19:48:39,422 Let me look in the head and\nthe style of the page now. 20729 19:48:39,422 --> 19:48:45,812 This is what we might call\nanother type of selector 20730 19:48:45,812 --> 19:48:49,382 that allows us to specify\nthe paragraph tag 20731 19:48:49,381 --> 19:48:52,111 that itself happens to\nbe the first child only. 
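As a rough illustration of that selector (the markup below is an assumed stand-in for paragraphs2.html, not the actual file), `p:first-child` matches a paragraph only when it is the first child of its parent, so no ID is needed:

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>paragraphs</title>
        <style>
            /* Matches a <p> only if it is the first child of its parent element */
            p:first-child {
                font-size: larger;
            }
        </style>
    </head>
    <body>
        <p>This paragraph is the first child of body, so it is rendered larger.</p>
        <p>This one is not the first child, so it keeps the default size.</p>
    </body>
</html>
```

One caveat worth knowing: if some other element, such as a heading, came before the first paragraph inside body, the selector would match nothing, because the paragraph would no longer be the first child.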
20732 19:48:52,111 --> 19:48:56,001 So you can apply CSS to a very\n 20733 19:48:56,001 --> 19:48:58,686 There's also syntax for last\nchild, if just the first one 20734 19:48:58,687 --> 19:49:00,312 is supposed to look a little different. 20735 19:49:00,312 --> 19:49:02,432 So, here, I've just\ngotten out of the business 20736 19:49:02,432 --> 19:49:05,282 of creating my own unique\nidentifier and, instead, I'm 20737 19:49:05,282 --> 19:49:08,142 using this type of selector as well. 20738 19:49:09,282 --> 19:49:14,072 Let me go into another example\nhere, called Link1.html 20739 19:49:14,072 --> 19:49:17,411 and here we have a very simple\n 20740 19:49:17,411 --> 19:49:19,411 But notice it's purple\nby default, because we've 20741 19:49:21,271 --> 19:49:24,331 Let's see if we can't maybe\nstylize Harvard's links 20742 19:49:25,812 --> 19:49:30,702 Let me go into Link version\n2, now, which looks like this. 20743 19:49:30,702 --> 19:49:33,012 And now Harvard is very red. 20744 19:49:33,971 --> 19:49:36,451 Well, let me right click\non it, click Inspect 20745 19:49:36,452 --> 19:49:37,952 and I can start to poke around. 20746 19:49:37,952 --> 19:49:40,381 It looks like my HTML is\nnot at all noteworthy. 20747 19:49:40,381 --> 19:49:43,952 It's just very simple HTML,\nanchor tag with an HREF. 20748 19:49:46,629 --> 19:49:48,461 And we can look at it\nin two different ways. 20749 19:49:48,461 --> 19:49:51,422 We can literally look at\nthe style, contents here 20750 19:49:51,422 --> 19:49:55,472 or we can look at Chrome's\npretty version of it, over here. 20751 19:49:55,471 --> 19:49:59,521 It looks like my style\nsheet, in the style tag 20752 19:49:59,521 --> 19:50:04,001 has changed the color to be red, and the\n 20753 19:50:04,001 --> 19:50:06,121 but it's another CSS property, to none. 
20754 19:50:06,122 --> 19:50:08,911 Notice, if I turn that\noff, links on the internet 20755 19:50:08,911 --> 19:50:10,831 are underlined by\ndefault, which tends to be 20756 19:50:10,831 --> 19:50:13,591 good for familiarity, for\nvisibility, for accessibility. 20757 19:50:13,592 --> 19:50:18,152 But, if it's very obvious what\nis text and what is a link 20758 19:50:18,152 --> 19:50:21,062 maybe you change text\ndecoration to none. 20759 19:50:21,062 --> 19:50:25,142 But maybe, watch this, maybe the\nlink comes, the line comes back 20760 19:50:26,611 --> 19:50:29,161 Well, let's look at how\nI did this in style. 20761 19:50:29,161 --> 19:50:34,111 Notice that I have stylization, and I\n 20762 19:50:34,111 --> 19:50:36,572 here, as tends to be convention in CSS. 20763 19:50:36,572 --> 19:50:38,581 Color is red, text decoration is none. 20764 19:50:38,581 --> 19:50:42,961 But, whenever an anchor\ntag is hovered over 20765 19:50:42,961 --> 19:50:47,881 you can change the text decoration\n 20766 19:50:47,881 --> 19:50:51,251 So, again, just little ways of playing\n 20767 19:50:51,251 --> 19:50:52,831 once you understand\nthat, really, there's 20768 19:50:52,831 --> 19:50:54,091 just different types of selectors. 20769 19:50:54,092 --> 19:50:56,131 And you might have to remind\n 20770 19:50:57,361 --> 19:51:02,551 But it's just another way of scoping\n 20771 19:51:02,551 --> 19:51:06,451 Let's look at version 3 of this\n 20772 19:51:06,452 --> 19:51:11,702 If I go to Link3.html, maybe I\nwant to have Harvard links red 20773 19:51:14,471 --> 19:51:17,011 Well, let's right click,\nand click Inspect. 20774 19:51:17,012 --> 19:51:21,872 And here we might have two links,\nwith a couple of techniques 20775 19:51:21,872 --> 19:51:24,631 just to, again, emphasize, you can\n 20776 19:51:24,631 --> 19:51:30,061 I gave my Harvard link an ID of\n 20777 19:51:30,062 --> 19:51:34,802 In my CSS, if we go to the head\nof the page, I then did this. 
20778 19:51:34,801 --> 19:51:37,441 The tag with the Harvard ID, a.k.a. 20779 19:51:37,441 --> 19:51:41,671 #Harvard, should be red,\n#Yale should be blue 20780 19:51:41,672 --> 19:51:45,572 and then any anchor tag should\nhave no text decoration 20781 19:51:45,572 --> 19:51:48,721 unless you hover over it, at which\n 20782 19:51:48,721 --> 19:51:52,231 And so, if I hover over Harvard,\nit's red underlined, Yale 20783 19:51:53,262 --> 19:51:56,262 If I want to get rid of the IDs, I\n 20784 19:51:57,542 --> 19:52:00,601 Same effect, but notice,\nI got rid of the IDs now. 20785 19:52:00,601 --> 19:52:02,281 How else can I express myself? 20786 19:52:02,282 --> 19:52:03,992 Well, let's look at the CSS here. 20787 19:52:03,991 --> 19:52:06,512 The anchor tag has no text\ndecoration by default 20788 19:52:06,512 --> 19:52:08,051 unless you're hovering over it. 20789 19:52:09,461 --> 19:52:11,881 This is what we would\ncall, on our list here 20790 19:52:11,881 --> 19:52:16,801 an attribute selector, where you\nselect tags using CSS notation 20791 19:52:18,372 --> 19:52:21,692 So this is saying, go ahead\nand find any anchor tag 20792 19:52:21,691 --> 19:52:26,341 whose HREF value happens to\nequal this URL, and make it red. 20793 19:52:26,342 --> 19:52:28,327 Do the same for Yale, and make it blue. 20794 19:52:28,327 --> 19:52:31,452 Now, this might not be ideal, because\n 20795 19:52:31,452 --> 19:52:33,601 these equal signs don't\nwork, because if it's 20796 19:52:33,601 --> 19:52:37,202 a different Harvard or different Yale\n 20797 19:52:37,202 --> 19:52:40,412 So let me look at version\n5 here, of Link.html. 20798 19:52:40,411 --> 19:52:43,951 Look at this style, and I\ndid this a little smarter. 20799 19:52:44,941 --> 19:52:46,951 And, again, just the kind\nof thing you look up.
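A hedged sketch of the two attribute-selector forms side by side. The exact URLs in the lecture's files are not visible in the captions, so the `href` values below are assumptions. For contrast, Harvard uses an exact match and Yale uses a substring match; the lecture's own versions use one form for both.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <title>link</title>
        <style>
            /* Every anchor: no underline by default... */
            a {
                text-decoration: none;
            }

            /* ...but underline while the cursor hovers over it */
            a:hover {
                text-decoration: underline;
            }

            /* Exact match: the href must equal this string character for character */
            a[href="https://www.harvard.edu/"] {
                color: red;
            }

            /* Substring match: the href only has to contain "yale.edu" somewhere */
            a[href*="yale.edu"] {
                color: blue;
            }
        </style>
    </head>
    <body>
        Visit <a href="https://www.harvard.edu/">Harvard</a> or
        <a href="https://www.yale.edu/">Yale</a>.
    </body>
</html>
```

With the exact form, a link to some other Harvard URL would stay the default color, which is the weakness the substring form avoids.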
20800 19:52:46,952 --> 19:52:54,542 Star equals means, change any anchor\n 20801 19:52:54,542 --> 19:52:59,551 Harvard.edu to red, and do the same\n 20802 19:52:59,551 --> 19:53:01,141 So star here connotes wildcard. 20803 19:53:01,142 --> 19:53:04,652 So search for Harvard.edu or\nYale.edu anywhere in the HREF 20804 19:53:04,652 --> 19:53:07,652 and if it's there, colorize the link. 20805 19:53:07,652 --> 19:53:11,732 And, again, we could do this all\n 20806 19:53:11,732 --> 19:53:15,012 to actually achieve the same kind\n 20807 19:53:15,012 --> 19:53:17,115 And as projects just\nget larger and larger 20808 19:53:17,115 --> 19:53:19,032 you just have more and\nmore decisions to make. 20809 19:53:19,032 --> 19:53:21,752 And so you have certain\nconventions you start to adopt. 20810 19:53:21,751 --> 19:53:25,171 And, indeed, if I may,\nyou have the introduction 20811 19:53:25,172 --> 19:53:28,142 of what are called\nframeworks, ultimately. 20812 19:53:28,142 --> 19:53:30,512 If you're a full-time\nweb developer, or you're 20813 19:53:30,512 --> 19:53:33,889 working for a company doing the same,\n 20814 19:53:34,682 --> 19:53:38,372 For instance, the company might say,\n 20815 19:53:38,372 --> 19:53:41,222 Or always use attribute\nselectors, or don't use this. 20816 19:53:41,221 --> 19:53:44,221 And it wouldn't be necessarily\nas draconian as that. 20817 19:53:44,221 --> 19:53:46,471 But they might have a\nstyle guide of sorts. 20818 19:53:46,471 --> 19:53:49,441 But, what many people, and\nmany companies, do nowadays 20819 19:53:49,441 --> 19:53:53,402 is they do not come up with all\nof their own CSS properties. 20820 19:53:53,402 --> 19:53:57,241 They start with something off the shelf,\n 20821 19:53:57,241 --> 19:54:00,781 source framework, that just gives\n 20822 19:54:00,782 --> 19:54:04,172 for free, just by using\na third party library. 
20823 19:54:04,172 --> 19:54:05,912 And one of the most\npopular ones nowadays 20824 19:54:05,911 --> 19:54:07,861 is something called\nBootstrap, that CS50 uses 20825 19:54:07,861 --> 19:54:11,161 on all of its websites,\nsuper-popular in industry as well. 20826 19:54:11,161 --> 19:54:17,161 It's at getbootstrap.com, and this\n 20827 19:54:17,161 --> 19:54:20,611 a website that documents\nthe library that they offer. 20828 19:54:20,611 --> 19:54:24,751 And there's so much documentation here,\n 20829 19:54:26,521 --> 19:54:30,122 It just gives you, out of the\nbox, the CSS with which you 20830 19:54:31,322 --> 19:54:33,392 If you've ever noticed\non CS50's website 20831 19:54:33,392 --> 19:54:36,002 little colorful warnings at\nthe top of the page, or call 20832 19:54:36,001 --> 19:54:37,861 outs, to draw your attention to things. 20833 19:54:38,851 --> 19:54:41,491 It's probably a paragraph\ntag or a div tag 20834 19:54:41,491 --> 19:54:43,111 and maybe we changed the font color. 20835 19:54:43,111 --> 19:54:44,731 We changed the background color. 20836 19:54:44,732 --> 19:54:47,282 Or it's a lot of stuff we could\nabsolutely do from scratch 20837 19:54:47,282 --> 19:54:49,747 but, you know what,\nwhy would we reinvent 20838 19:54:49,747 --> 19:54:51,372 the wheel if we can just use Bootstrap. 20839 19:54:51,372 --> 19:54:53,252 So, for instance, let\nme just scroll down. 20840 19:54:53,251 --> 19:54:57,851 If you've ever seen on CS50's website\n 20841 19:54:57,851 --> 19:55:00,542 let me just zoom in on this. 20842 19:55:00,542 --> 19:55:03,271 We are just using HTML like this. 20843 19:55:03,271 --> 19:55:06,241 We're using a div tag, which,\nagain, is an invisible division 20844 19:55:06,241 --> 19:55:07,812 a rectangular region of the page. 20845 19:55:07,812 --> 19:55:12,812 But we're using classes called alert\n 20846 19:55:12,812 --> 19:55:17,461 Those are classes that the\nfolks at Bootstrap invented. 
20847 19:55:17,461 --> 19:55:19,982 They associated certain\ntext colors and background 20848 19:55:19,982 --> 19:55:23,042 colors and padding and margin\nand like other aesthetics with 20849 19:55:23,042 --> 19:55:25,411 so all we have to do\nis use those classes. 20850 19:55:25,411 --> 19:55:28,441 Role equals alert, just makes clear\n 20851 19:55:28,441 --> 19:55:30,512 is an alert, that should\nprobably be recited 20852 19:55:30,512 --> 19:55:33,601 and whatever's in between\nthe open tag and close tag 20853 19:55:33,601 --> 19:55:35,312 is what the human would see. 20854 19:55:35,312 --> 19:55:37,262 How do you use something like Bootstrap? 20855 19:55:37,262 --> 19:55:39,032 Well, you just read the documentation. 20856 19:55:39,032 --> 19:55:44,272 Under Getting Started, there is a\n 20857 19:55:45,232 --> 19:55:49,312 So in Table.html, we had code like this. 20858 19:55:49,312 --> 19:55:52,182 Let me actually read Bootstrap's\ndocumentation really fast. 20859 19:55:55,542 --> 19:55:57,551 I'm going to put this\ninto the head of my page. 20860 19:55:57,551 --> 19:55:59,981 And it's quite long, but\nnotice, it's a link tag 20861 19:55:59,982 --> 19:56:03,851 which I used earlier for my\nown CSS file, the HREF of which 20862 19:56:03,851 --> 19:56:06,491 is this CDN link, content\ndelivery network, that's 20863 19:56:06,491 --> 19:56:09,732 referring to a specific version of\n 20864 19:56:09,732 --> 19:56:13,752 And the file that I'm including\nis called Bootstrap.min.css. 20865 19:56:13,751 --> 19:56:17,291 This is an actual file I\ncan visit with my browser. 20866 19:56:17,292 --> 19:56:20,622 If I open this in a separate\ntab, this is the CSS 20867 19:56:20,622 --> 19:56:23,532 that Bootstrap has made\nfreely available to us. 20868 19:56:25,221 --> 19:56:27,221 That's because it's been\nminimized, just to not 20869 19:56:27,221 --> 19:56:29,771 waste space by adding lots\nof white space and comments. 
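Roughly what those two pieces look like together, as a sketch rather than CS50's actual markup. The CDN version number, the specific `alert-warning` class, and the message text are assumptions for illustration; the captions cut off the second class name, and Bootstrap's Getting Started page lists the current link tag.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <!-- Bootstrap's minified stylesheet, served from a CDN (version assumed) -->
        <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet">
        <title>alert</title>
    </head>
    <body>
        <!-- "alert" and "alert-warning" are classes defined by Bootstrap, not by this page -->
        <div class="alert alert-warning" role="alert">
            This text is what the human would see inside the callout.
        </div>
    </body>
</html>
```

No CSS is written by hand here: the colors, padding, and margin all come from rules that Bootstrap attaches to those class names.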
20870 19:56:29,771 --> 19:56:33,221 But this contains a whole lot,\nhundreds, of CSS properties 20871 19:56:33,221 --> 19:56:36,851 that we can reuse, thanks to\nclasses that they invented. 20872 19:56:36,851 --> 19:56:40,285 If I want to use some JavaScript\n 20873 19:56:40,285 --> 19:56:41,952 But we'll come back to that before long. 20874 19:56:41,952 --> 19:56:45,522 Let me now just make a couple\nof tweaks to this table. 20875 19:56:45,521 --> 19:56:48,551 If I go into my browser\nfrom before, this 20876 19:56:48,551 --> 19:56:51,461 is what it looked like previously,\nwhere name and number were 20877 19:56:51,461 --> 19:56:54,086 bold, but centered, and then\nCarter and David were on the left 20878 19:56:54,086 --> 19:56:55,504 and the numbers were to the right. 20879 19:56:56,202 --> 19:56:59,662 It's not that pretty, but it'd be nice\n 20880 19:56:59,661 --> 19:57:03,101 So if we add Bootstrap into it,\nnotice one thing happens first 20881 19:57:04,872 --> 19:57:08,262 No longer are Chrome's\ndefault styles used. 20882 19:57:08,262 --> 19:57:10,482 Now Bootstrap's default\nstyles are used, which 20883 19:57:10,482 --> 19:57:14,112 is a way of enforcing similarity\nacross Chrome, Edge, Firefox 20884 19:57:15,461 --> 19:57:18,042 Notice it went from a\nserif font to a sans serif 20885 19:57:18,042 --> 19:57:19,792 font, and something cleaner like this. 20886 19:57:19,792 --> 19:57:24,221 It still looks pretty ugly, but let\n 20887 19:57:24,221 --> 19:57:30,311 Let me go under their\ncontent tab, for tables. 20888 19:57:30,312 --> 19:57:32,532 And if I just kind of\nstart skimming this 20889 19:57:32,532 --> 19:57:34,392 these are some good\nlooking tables, right? 20890 19:57:34,392 --> 19:57:38,442 Like, there's some underlining\nhere, some bolder font. 
20891 19:57:39,672 --> 19:57:41,790 If I keep going, ooh,\nthat's getting pretty, too 20892 19:57:41,789 --> 19:57:44,831 if I want to have a colorful table,\n 20893 19:57:44,831 --> 19:57:47,981 out myself if I want\nsome dark mode here 20894 19:57:47,982 --> 19:57:51,472 if I want to have alternating\nhighlights, and so forth. 20895 19:57:51,471 --> 19:57:54,491 There's so many different stylizations\n 20896 19:57:54,491 --> 19:57:58,211 But I care about making a phone book,\n 20897 19:57:58,211 --> 19:58:02,982 So if I read the documentation closely,\n 20898 19:58:02,982 --> 19:58:06,881 is add Bootstrap's table\nclass to my table tag 20899 19:58:06,881 --> 19:58:11,592 and watch with a simple reload, what\n 20900 19:58:13,331 --> 19:58:16,391 Might not be what you want, but, my\n 20901 19:58:16,392 --> 19:58:18,262 I just really prettied things up. 20902 19:58:18,262 --> 19:58:21,112 And so here, then, is the value of\n 20903 19:58:21,111 --> 19:58:27,311 It allows you to actually\ncreate much prettier, much more 20904 19:58:27,312 --> 19:58:32,232 user-friendly websites than you might\n 20905 19:58:34,211 --> 19:58:37,842 In fact, let's iterate one\nmore time on one other example 20906 19:58:37,842 --> 19:58:39,862 before we introduce a bit of that code. 20907 19:58:39,861 --> 19:58:44,691 Let me go ahead and open\nup Search.html from before 20908 19:58:44,691 --> 19:58:49,331 which, recall, looks like this,\nand Search.html on my browser 20909 19:58:49,331 --> 19:58:52,511 was this very simple Google search. 20910 19:58:52,512 --> 19:58:56,801 And suppose I want to reinvent\nGoogle.com's UI a bit more. 20911 19:58:56,801 --> 19:58:59,711 Here's a screenshot of\nGoogle.com on a typical day. 
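To recap the table step before the search page is rebuilt: the only change to the markup is one class on the table tag. The sketch below assumes the same CDN link as Bootstrap's documentation suggests (version assumed), and the phone numbers are made-up placeholders.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet">
        <title>table</title>
    </head>
    <body>
        <!-- class="table" opts this table into Bootstrap's table styling -->
        <table class="table">
            <thead>
                <tr>
                    <th>Name</th>
                    <th>Number</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>Carter</td>
                    <td>+1-555-0100</td>
                </tr>
                <tr>
                    <td>David</td>
                    <td>+1-555-0101</td>
                </tr>
            </tbody>
        </table>
    </body>
</html>
```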
20912 19:58:59,711 --> 19:59:03,581 It's got an about link, a store\n 20913 19:59:05,360 --> 19:59:07,152 It's not appearing well\non the screen here 20914 19:59:07,152 --> 19:59:09,851 but there's a big text box in\nthe middle, and then two buttons 20915 19:59:09,851 --> 19:59:12,131 Google search, and I'm feeling lucky. 20916 19:59:12,131 --> 19:59:15,851 Well, could I maybe go about\nimplementing this UI myself 20917 19:59:15,851 --> 19:59:19,872 using some HTML, some CSS,\nand maybe Bootstrap's help 20918 19:59:19,872 --> 19:59:23,110 just so I don't have to figure out\n 20919 19:59:23,110 --> 19:59:24,402 Well, here's my starting point. 20920 19:59:24,402 --> 19:59:29,631 In Search.html, let's go and add\n 20921 19:59:29,631 --> 19:59:34,342 so that we have access to all of\n 20922 19:59:34,342 --> 19:59:37,732 And let me go ahead and\nfigure out how to do this. 20923 19:59:37,732 --> 19:59:43,392 Well, just like Stanford's site had\n 20924 19:59:43,392 --> 19:59:46,362 but they changed it from being a\n 20925 19:59:46,361 --> 19:59:48,322 I bet I can do something\nlike this myself. 20926 19:59:48,322 --> 19:59:50,562 So let me go into the body\nof my page and, first 20927 19:59:50,562 --> 19:59:54,851 based on Bootstrap's documentation,\nlet me add a div called 20928 19:59:54,851 --> 19:59:57,641 a div with a class of container fluid. 20929 19:59:57,642 --> 19:59:59,832 Container fluid is\njust a class that comes 20930 19:59:59,831 --> 20:00:03,491 with Bootstrap that says, make\nyour web page fluid, that is 20931 20:00:04,961 --> 20:00:07,032 So that way it's going to resize nicely. 20932 20:00:07,032 --> 20:00:09,392 I'm going to go ahead and\nfix my indentation here. 20933 20:00:09,392 --> 20:00:11,142 If you haven't discovered\nthis yet, if you 20934 20:00:11,142 --> 20:00:13,032 highlight multiple lines\nin VS Code, you can 20935 20:00:13,032 --> 20:00:15,082 hit Tab and indent them all at once. 
20936 20:00:15,081 --> 20:00:17,451 So now, I have all of\nthat inside of this div. 20937 20:00:17,452 --> 20:00:23,052 Now, just like in Stanford's site, let's\n 20938 20:00:23,051 --> 20:00:32,561 an LI, called with a class of NAV item,\n 20939 20:00:32,562 --> 20:00:42,702 let me go ahead and say, A\nHREF=https://about.google 20940 20:00:42,702 --> 20:00:44,972 which is the real URL\nof Google's about page. 20941 20:00:44,971 --> 20:00:46,771 And I'll put the about text in there. 20942 20:00:46,771 --> 20:00:50,911 Then I'm going to close my LI tag\n 20943 20:00:50,911 --> 20:00:52,381 because I'm using Bootstrap. 20944 20:00:52,381 --> 20:00:54,391 Bootstrap's documentation,\nif I read it closely 20945 20:00:54,392 --> 20:00:59,642 says to add a class to your links,\n 20946 20:00:59,642 --> 20:01:04,712 to make it dark, like black or dark\n 20947 20:01:04,711 --> 20:01:08,551 All right, so I think I\nhave now an about link 20948 20:01:08,551 --> 20:01:10,661 in a navigation part of my screen. 20949 20:01:10,661 --> 20:01:14,161 Let me go ahead and\nsave this and reload. 20950 20:01:14,161 --> 20:01:16,351 All right, so not exactly what I wanted. 20951 20:01:16,351 --> 20:01:19,741 It's a bulleted list, still, so\nI need to override this somehow. 20952 20:01:19,741 --> 20:01:22,831 Let me read Bootstrap's\ndocumentation a little more clearly. 20953 20:01:22,831 --> 20:01:25,021 And let me pretend to do\nthat, for time's sake. 20954 20:01:25,021 --> 20:01:28,682 If I go under content, oops,\nif I go under components 20955 20:01:28,682 --> 20:01:32,801 and I go to Navs and\nTabs, long story short 20956 20:01:32,801 --> 20:01:35,701 if you want to create a pretty menu\n 20957 20:01:35,702 --> 20:01:37,802 from the left to the\nright, just like Stanford 20958 20:01:37,801 --> 20:01:40,001 I essentially need HTML like this. 20959 20:01:40,001 --> 20:01:42,271 And this is subtle, but\nI left off this class. 
20960 20:01:42,271 --> 20:01:45,902 I should have added a\nclass called NAV on my UL. 20961 20:01:46,831 --> 20:01:49,322 Let me go in here and\nsay add class equals 20962 20:01:49,322 --> 20:01:53,702 NAV, and then again, this class\nNAV item, Bootstrap told me to 20963 20:01:53,702 --> 20:01:56,461 NAV link text dark,\nBootstrap told me to. 20964 20:01:56,461 --> 20:02:02,521 Let me go back to my page here,\n 20965 20:02:02,521 --> 20:02:05,161 But at least the About link is\nin the top left hand corner 20966 20:02:05,161 --> 20:02:07,891 just like it should be\nin the real google.com. 20967 20:02:07,892 --> 20:02:10,142 Now let me whip up a couple\nof more links real fast. 20968 20:02:10,142 --> 20:02:13,472 Let me go and do a little\ncopy/paste, though I bet next week 20969 20:02:13,471 --> 20:02:15,811 we can avoid this kind of copy/paste. 20970 20:02:15,812 --> 20:02:20,252 Let me change this link\nto be Store.google.com. 20971 20:02:22,471 --> 20:02:26,941 Let me go ahead and create\nanother one here for Gmail. 20972 20:02:26,941 --> 20:02:31,682 So this one's going to go\nto, officially, how about 20973 20:02:31,682 --> 20:02:35,642 technically it's www.google.com/gmail. 20974 20:02:37,172 --> 20:02:39,332 And let me grab one more of these. 20975 20:02:39,331 --> 20:02:42,691 And for Google Images, and I'm\ngoing to paste this, whoops 20976 20:02:44,042 --> 20:02:45,631 I'm going to put this here, too. 20977 20:02:45,631 --> 20:02:53,251 This is going to be images, and\nthat URL is IMG.hp, is the URL. 20978 20:02:53,251 --> 20:02:56,281 All right, let me go ahead\nand reload the browser page. 20979 20:02:56,282 --> 20:02:57,842 Now it's coming along, right? 20980 20:02:57,842 --> 20:02:59,491 About, store, Gmail, images. 20981 20:02:59,491 --> 20:03:01,057 It's not quite what I want. 
20982 20:03:01,057 --> 20:03:03,182 So I'd have to read the\ndocumentation to figure out 20983 20:03:03,182 --> 20:03:07,395 how to maybe nudge one of these\n 20984 20:03:07,395 --> 20:03:09,062 And there's a couple of ways to do this. 20985 20:03:09,062 --> 20:03:13,351 But one way is if I want Gmail to move\n 20986 20:03:13,351 --> 20:03:23,161 else, I can say that add some margin to\n 20987 20:03:23,161 --> 20:03:27,721 This is in Bootstrap's documentation, a\n 20988 20:03:27,721 --> 20:03:29,941 just automatically\nshove everything apart. 20989 20:03:29,941 --> 20:03:34,441 And now, if I reload the page\n 20990 20:03:35,461 --> 20:03:37,336 All right, so now we're\nkind of moving along. 20991 20:03:37,336 --> 20:03:40,422 Let me go ahead and add the\nbig blue button to sign in. 20992 20:03:40,422 --> 20:03:45,422 So here with sign in, let me go\n 20993 20:03:45,422 --> 20:03:49,202 so let's go ahead and do one\nmore LI, class equals NAV item. 20994 20:03:49,202 --> 20:03:52,442 And then, inside of this LI\ntag, what am I going to do? 20995 20:03:52,441 --> 20:03:58,111 Turns out there is a class that can turn\n 20996 20:03:58,111 --> 20:04:01,411 for button, and then button\nprimary, makes it blue 20997 20:04:01,411 --> 20:04:04,554 the HREF for this one is going\nto be https://accounts.goo 20998 20:04:04,554 --> 20:04:08,491 gle.com/service/login, which is\n 20999 20:04:09,631 --> 20:04:11,941 The role of this link is that of button. 21000 20:04:11,941 --> 20:04:15,241 And then sign in, is going\nto be the text on it. 21001 20:04:15,241 --> 20:04:18,744 If I now reload the page, now\nwe're getting even closer 21002 20:04:18,744 --> 20:04:20,161 although it looks a little stupid. 21003 20:04:20,161 --> 20:04:22,921 Notice that sign in is way\nin the top right hand corner 21004 20:04:22,922 --> 20:04:26,492 whereas the real google.com has\n 21005 20:04:26,491 --> 20:04:28,021 OK, that's an easy fix, too. 
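A sketch of the navigation bar as built up to this point, with the search form from Search.html left out. Several details are assumptions: the CDN version, `ms-auto` as the margin class that pushes Gmail and everything after it to the right (that is the Bootstrap 5 name; the captions cut the class off), and the exact spelling of the Images and sign-in URLs.

```html
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta name="viewport" content="initial-scale=1, width=device-width">
        <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet">
        <title>search</title>
    </head>
    <body>
        <!-- container-fluid lets the content span the full width of the window -->
        <div class="container-fluid">
            <!-- class="nav" is what turns the bulleted list into a horizontal menu -->
            <ul class="nav">
                <li class="nav-item">
                    <a class="nav-link text-dark" href="https://about.google">About</a>
                </li>
                <li class="nav-item">
                    <a class="nav-link text-dark" href="https://store.google.com">Store</a>
                </li>
                <!-- ms-auto: automatic left margin, so this item and the ones after it move right -->
                <li class="nav-item ms-auto">
                    <a class="nav-link text-dark" href="https://www.google.com/gmail">Gmail</a>
                </li>
                <li class="nav-item">
                    <a class="nav-link text-dark" href="https://www.google.com/imghp">Images</a>
                </li>
                <li class="nav-item">
                    <!-- btn and btn-primary style the link as a blue button -->
                    <a class="btn btn-primary" href="https://accounts.google.com/ServiceLogin" role="button">Sign in</a>
                </li>
            </ul>
        </div>
    </body>
</html>
```

Removing `class="nav"` from the `ul` brings the bullets back, which is a quick way to see how much of the layout comes from that one class.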
21006 20:04:28,021 --> 20:04:30,061 Let me go back into my HTML here. 21007 20:04:31,849 --> 20:04:33,182 This, too, is a Bootstrap thing. 21008 20:04:33,182 --> 20:04:35,702 They have a class called m-something. 21009 20:04:35,702 --> 20:04:38,461 The something is a\nnumber from like 1 to 5 21010 20:04:38,461 --> 20:04:41,982 I believe, that adds just\nsome amount of white space. 21011 20:04:41,982 --> 20:04:45,672 So if I reload now, OK,\nit's just a little prettier. 21012 20:04:47,191 --> 20:04:50,671 Just to demonstrate how I can\ntake this home, let me go ahead 21013 20:04:50,672 --> 20:04:55,232 and open up my premade\nversion of this, whereby 21014 20:04:55,232 --> 20:04:58,862 I added to this some final flourishes. 21015 20:04:58,861 --> 20:05:02,281 If I go to Search2.html, I\ndecided to replace their logo 21016 20:05:02,282 --> 20:05:05,582 with just this out of a\ncat, and notice that I 21017 20:05:05,581 --> 20:05:07,664 re-implemented essentially google.com. 21018 20:05:07,664 --> 20:05:10,081 Here's a text box, here's two\nbuttons, even though they're 21019 20:05:10,081 --> 20:05:11,641 a little washed out on the screen. 21020 20:05:11,642 --> 20:05:14,882 I even figured out how to get dots\n 21021 20:05:14,881 --> 20:05:18,421 And if we view source, you can see\n 21022 20:05:18,422 --> 20:05:25,322 If I go to view developer tools, and I\n 21023 20:05:25,322 --> 20:05:31,081 and I go into this div, you'll see\n 21024 20:05:31,081 --> 20:05:35,531 And I added some classes there to make\n 21025 20:05:35,532 --> 20:05:38,652 If I go into the form tag, this\nis the same form tag as before. 21026 20:05:38,652 --> 20:05:40,952 But, notice, I used\nbutton tags this time 21027 20:05:40,952 --> 20:05:43,672 with button and button light classes. 21028 20:05:43,672 --> 20:05:45,961 And then I stylized\nthem in a certain way. 
21029 20:05:45,961 --> 20:05:49,221 And so in the end result, if I want\n 21030 20:05:49,221 --> 20:05:51,981 and click Google search,\nvoila, I've implemented 21031 20:05:51,982 --> 20:05:54,982 something that's pretty\ndarn close to Google.com 21032 20:05:54,982 --> 20:05:57,952 without even touching raw CSS myself. 21033 20:05:57,952 --> 20:06:00,502 And now here's the value,\nthen, of a framework. 21034 20:06:00,501 --> 20:06:02,811 You can just start to use\noff the shelf functionality 21035 20:06:02,812 --> 20:06:04,622 that someone else created for you. 21036 20:06:04,622 --> 20:06:07,101 But if you want to make\nrefinements, you don't really 21037 20:06:07,101 --> 20:06:09,831 like the shade of blue that\nBootstrap chose, or the gray button 21038 20:06:09,831 --> 20:06:11,748 or you want to curve\nthings a bit more, that's 21039 20:06:11,748 --> 20:06:14,122 where you can create\nyour own CSS file, and do 21040 20:06:14,122 --> 20:06:16,372 the last mile, sort\nof fine tuning things. 21041 20:06:16,372 --> 20:06:17,961 And that tends to be best practice. 21042 20:06:17,961 --> 20:06:21,051 Stand on the shoulders of others as\n 21043 20:06:21,051 --> 20:06:23,551 And then if you really don't\nlike what the library is doing 21044 20:06:23,551 --> 20:06:26,301 then use your own skills and\nunderstanding of HTML and CSS 21045 20:06:26,301 --> 20:06:29,551 to refine things a bit further. 21046 20:06:29,551 --> 20:06:33,262 But still, after all of that, all of\n 21047 20:06:33,262 --> 20:06:35,752 are still static, other\nthan the Google one 21048 20:06:35,751 --> 20:06:37,671 which searches on the real Google.com. 21049 20:06:37,672 --> 20:06:39,532 Let's take a final 5\nminute break and we'll 21050 20:06:39,532 --> 20:06:43,282 give you a sense of what we can next\n 21051 20:06:44,822 --> 20:06:48,381 All right, so I think\nit's fair to say, we're 21052 20:06:48,381 --> 20:06:50,819 about to see our very last language. 
21053 20:06:50,820 --> 20:06:52,612 Next week and final\nprojects are ultimately 21054 20:06:52,611 --> 20:06:55,101 going to be about\nsynthesizing so many of these. 21055 20:06:55,101 --> 20:06:58,042 Thankfully, this language called\nJavaScript is quite similar 21056 20:06:58,042 --> 20:07:00,229 syntactically to both C and Python. 21057 20:07:00,229 --> 20:07:03,021 And, indeed, if you can imagine\n 21058 20:07:03,021 --> 20:07:05,391 you can probably do it in\nsome form in JavaScript. 21059 20:07:05,392 --> 20:07:07,742 The most fundamental\ndifference today, though 21060 20:07:07,741 --> 20:07:11,122 is that when you have written C\ncode and Python code thus far 21061 20:07:11,122 --> 20:07:12,379 you've done it on the server. 21062 20:07:12,379 --> 20:07:14,461 You've done it in the\nterminal window environment. 21063 20:07:14,461 --> 20:07:17,812 And when you run the code, it's\n 21064 20:07:17,812 --> 20:07:20,122 The difference now today\nwith JavaScript is 21065 20:07:20,122 --> 20:07:23,362 even though you're going to write\nit in the cloud using VS Code 21066 20:07:23,361 --> 20:07:27,981 recall that, when a browser gets\nthe page containing this code 21067 20:07:27,982 --> 20:07:32,012 it's going to get a copy of the HTML,\n 21068 20:07:32,012 --> 20:07:37,012 So JavaScript, that we see today, is\n 21069 20:07:37,012 --> 20:07:40,732 on users' own Macs, PCs, and\nphones, not in the server. 21070 20:07:40,732 --> 20:07:44,062 JavaScript can be used on the server,\n 21071 20:07:44,062 --> 20:07:47,752 It's an alternative to Python or\n 21072 20:07:47,751 --> 20:07:51,361 We are using it today client\nside, which is a key difference. 21073 20:07:51,361 --> 20:07:53,932 So in Scratch, let's\ndo this one last time. 21074 20:07:53,932 --> 20:07:56,991 If you wanted to create a variable\n 21075 20:07:56,991 --> 20:07:59,331 In JavaScript, it's\ngoing to look like this. 
21076 20:07:59,331 --> 20:08:02,241 You don't specify the type,\nbut you do use the keyword let 21077 20:08:02,241 --> 20:08:07,581 and there's a few others as well, that\n 21078 20:08:07,581 --> 20:08:11,631 If you want to increment that\nvariable by one, you in JavaScript 21079 20:08:11,631 --> 20:08:14,932 could say something like,\ncounter equals counter plus 1 21080 20:08:14,932 --> 20:08:17,512 or you can do it more\nsuccinctly, with plus equals 21081 20:08:17,512 --> 20:08:20,092 or the plus plus is back in JavaScript. 21082 20:08:20,092 --> 20:08:23,122 You can now say counter\nplus plus semicolon again. 21083 20:08:23,122 --> 20:08:26,122 In Scratch, if you wanted to\ndo a conditional like this 21084 20:08:26,122 --> 20:08:30,741 asking if x less than y, it looks\n 21085 20:08:32,361 --> 20:08:35,691 The curly braces here are back, if you\n 21086 20:08:35,691 --> 20:08:39,951 But, syntactically, it's pretty much\n 21087 20:08:39,952 --> 20:08:42,502 and even for it's else if else. 21088 20:08:42,501 --> 20:08:45,621 Unlike Python, it's two\nwords again, else if. 21089 20:08:45,622 --> 20:08:49,347 So quite, quite like C,\nnothing new beyond that. 21090 20:08:49,346 --> 20:08:52,221 If you want to do something forever\n 21091 20:08:52,221 --> 20:08:55,731 In JavaScript, you can do it a few\n 21092 20:08:57,172 --> 20:09:00,652 In JavaScript, Booleans are\nlowercase again, just like in C. 21093 20:09:02,073 --> 20:09:04,281 If you want to do something\na finite number of times 21094 20:09:04,282 --> 20:09:08,042 like repeat three times,\nlooks almost like C as well. 21095 20:09:08,042 --> 20:09:12,452 The only difference, really, is using\n 21096 20:09:12,452 --> 20:09:15,142 And, again, you'll use let to\ncreate a string, or an INT 21097 20:09:15,142 --> 20:09:17,362 or any other type of\nvariable in JavaScript. 21098 20:09:17,361 --> 20:09:21,021 The browser will figure out\nwhat type you mean from context. 
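The syntax described above can be sketched in a few lines. This is a minimal illustration rather than code from the lecture: the names counter, compare, and meows are assumptions chosen for the example, and the infinite while (true) loop is left out so the sketch terminates.

```javascript
// Declare a variable with let; no type is specified
let counter = 0;

// Three equivalent ways to add 1 to it
counter = counter + 1;
counter += 1;
counter++;

// Conditionals use parentheses and curly braces, and "else if" is two words
function compare(x, y) {
    if (x < y) {
        return 'x is less than y';
    } else if (x > y) {
        return 'x is greater than y';
    } else {
        return 'x is equal to y';
    }
}

// Repeat three times: like C's for loop, but with let instead of int
let meows = [];
for (let i = 0; i < 3; i++) {
    meows.push('meow');
}

console.log(counter);
console.log(compare(1, 2));
console.log(meows.join(' '));
```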
21099 20:09:21,021 --> 20:09:24,232 In C we would have said INT instead. 21100 20:09:24,232 --> 20:09:27,952 Ultimately, this language, and that's\n 21101 20:09:27,952 --> 20:09:30,502 There's bunches of other\nfeatures, but syntactically it's 21102 20:09:30,501 --> 20:09:32,841 going to be that accessible,\nrelatively speaking. 21103 20:09:32,842 --> 20:09:35,572 The power of JavaScript\nrunning in the user's browser 21104 20:09:35,572 --> 20:09:39,232 is going to be that you can\nchange this thing in memory. 21105 20:09:39,232 --> 20:09:43,432 Think about most any website, that's\n 21106 20:09:43,432 --> 20:09:45,771 It's typically very\ninteractive and dynamic. 21107 20:09:45,771 --> 20:09:49,581 If you're sitting in front of Gmail on\n 21108 20:09:49,581 --> 20:09:52,141 open, and someone sends you\nan email, all of a sudden 21109 20:09:52,142 --> 20:09:55,402 another row appears in your\ninbox, another row, another row. 21110 20:09:56,721 --> 20:09:58,711 Honestly, it could be an HTML table. 21111 20:09:58,711 --> 20:10:00,741 Maybe it's a bunch of\ndivs top to bottom. 21112 20:10:00,741 --> 20:10:04,072 The point, though, is, you don't\n 21113 20:10:04,072 --> 20:10:06,532 R to reload the page to see more email. 21114 20:10:06,532 --> 20:10:10,072 It automatically appears\nevery few seconds or minutes. 21115 20:10:11,751 --> 20:10:14,811 When you visit Gmail.com,\nyou are downloading not just 21116 20:10:14,812 --> 20:10:18,322 HTML and CSS with your\ninitial inbox, presumably. 
21117 20:10:18,322 --> 20:10:20,392 You're downloading some\nJavaScript code, that 21118 20:10:20,392 --> 20:10:24,592 is designed to keep talking every\n 21119 20:10:24,592 --> 20:10:28,282 to Gmail servers, and they,\nthen, are using their code 21120 20:10:28,282 --> 20:10:31,492 to add another element, another\nelement, another element 21121 20:10:31,491 --> 20:10:36,771 to the existing DOM, document object\n 21122 20:10:36,771 --> 20:10:40,851 in memory that represents HTML,\n 21123 20:10:43,232 --> 20:10:46,101 If you click and drag and drag\nand drag, your browser did not 21124 20:10:46,101 --> 20:10:49,351 download the entire world to\nyour Mac or PC by default. 21125 20:10:49,351 --> 20:10:52,491 It only downloaded what's in your\n 21126 20:10:52,491 --> 20:10:55,491 But when you click and drag, it's\n 21127 20:10:55,491 --> 20:10:58,702 some more images, some more\nimages, as you keep dragging, using 21128 20:10:58,702 --> 20:11:01,312 JavaScript, again, behind the scenes. 21129 20:11:01,312 --> 20:11:04,672 So let's actually use JavaScript\n 21130 20:11:05,979 --> 20:11:08,271 We can put the JavaScript\ncode in the head of the page 21131 20:11:08,271 --> 20:11:12,182 in the body of the page, or even\n 21132 20:11:13,131 --> 20:11:17,601 Here is a new version of\nHello.html, that, during the break 21133 20:11:17,601 --> 20:11:20,781 I just added a form to, because it'd\n 21134 20:11:20,782 --> 20:11:24,472 say Hello, title, Hello, body, it said,\n 21135 20:11:25,672 --> 20:11:28,672 I've got a form that I borrowed\nfrom some of our earlier code 21136 20:11:28,672 --> 20:11:34,882 and that form has an input whose ID is\n 21137 20:11:34,881 --> 20:11:36,572 But there's no code in this yet. 21138 20:11:36,572 --> 20:11:39,601 So let's add a little bit of\nJavaScript code as follows. 21139 20:11:39,601 --> 20:11:43,101 Suppose that, when this form is\n 21140 20:11:44,282 --> 20:11:47,302 Well, let's do it the\nsomewhat messy way first. 
21141 20:11:47,301 --> 20:11:51,351 I can add an attribute called\non submit to the form element 21142 20:11:51,351 --> 20:11:56,452 and I can say on submit, call the\n 21143 20:11:56,452 --> 20:11:58,552 Unfortunately, this\nfunction doesn't yet exist. 21144 20:11:59,769 --> 20:12:01,101 But there's another detail here. 21145 20:12:01,101 --> 20:12:04,851 When the user clicks submit, normally\n 21146 20:12:04,851 --> 20:12:06,381 I don't want to do that today. 21147 20:12:06,381 --> 20:12:10,372 I want to just submit the form to\n 21148 20:12:10,372 --> 20:12:13,142 and just print to the screen,\nHello, David, or so forth. 21149 20:12:13,142 --> 20:12:15,862 So I'm also going to go\nahead and say, return false. 21150 20:12:15,861 --> 20:12:20,301 And this is a JavaScript way of telling\n 21151 20:12:20,301 --> 20:12:21,831 to submit the form, return false. 21152 20:12:21,831 --> 20:12:24,171 Like, no, don't let them\nactually submit the form. 21153 20:12:24,172 --> 20:12:26,362 But do call this function called greet. 21154 20:12:26,361 --> 20:12:30,951 In the head of my page, I'm going to add\n 21155 20:12:30,952 --> 20:12:33,682 implicitly JavaScript,\nand has no relationship 21156 20:12:33,682 --> 20:12:37,402 for those of you who took APCS with\n 21157 20:12:37,402 --> 20:12:41,542 but no relation, I'm going to\nname a function called Greet. 21158 20:12:41,542 --> 20:12:44,631 Apparently in JavaScript, the\nway you create a function is you 21159 20:12:44,631 --> 20:12:47,361 literally say the word\nfunction instead of Def. 21160 20:12:47,361 --> 20:12:49,701 You don't specify a return type. 21161 20:12:49,702 --> 20:12:54,381 And in this function, I could do\n 21162 20:12:54,381 --> 20:12:57,801 unquote, how about, Hello, there. 21163 20:12:57,801 --> 20:13:01,501 Initially I'm going to keep it simple,\n 21164 20:13:01,501 --> 20:13:03,031 which is not a good user interface. 21165 20:13:03,032 --> 20:13:04,407 There are better ways to do this. 
21166 20:13:04,407 --> 20:13:06,382 But we're doing something simple first. 21167 20:13:06,381 --> 20:13:09,232 Let me now go ahead and\nload this page again. 21168 20:13:09,232 --> 20:13:12,562 It still looks as simple as before,\nwith just a simple text box. 21169 20:13:12,562 --> 20:13:13,942 I'll zoom in to make it bigger. 21170 20:13:13,941 --> 20:13:15,471 I'm going to type my name,\nbut I think it's going 21171 20:13:15,471 --> 20:13:17,241 to be ignored when I click Submit. 21172 20:13:18,831 --> 20:13:21,111 And this is, again, this\nis an ugly user interface. 21173 20:13:21,111 --> 20:13:24,381 It literally says the whole\ncode space URL of the web page 21174 20:13:25,581 --> 20:13:29,456 It's really just meant for simple\n 21175 20:13:29,456 --> 20:13:31,581 All right, let's have it\nsay Hello, David, somehow. 21176 20:13:33,032 --> 20:13:37,222 Well, if this element on the\npage was given by me a unique ID 21177 20:13:37,221 --> 20:13:41,331 it'd be nice if, just like in CSS, I\n 21178 20:13:45,572 --> 20:13:49,461 Let me store, in a variable\ncalled name, the result 21179 20:13:49,461 --> 20:13:54,381 of calling a special function\ncalled document.queryselector. 21180 20:13:54,381 --> 20:13:57,682 This query selector function\nis JavaScript's version 21181 20:13:57,682 --> 20:14:00,142 of what we were doing\nin CSS, to select nodes 21182 20:14:00,142 --> 20:14:02,962 using hashes or dots or other syntax. 21183 20:14:04,441 --> 20:14:09,391 So if I want to select the\nelement whose unique ID is name 21184 20:14:09,392 --> 20:14:12,712 I can literally just pass, in\n 21185 20:14:14,211 --> 20:14:17,032 That gives me the actual\nnode from the tree. 21186 20:14:17,032 --> 20:14:21,202 It gives me one of these rectangles\n 21187 20:14:21,202 --> 20:14:24,351 If I actually want to get at\nthe specific value therein 21188 20:14:24,351 --> 20:14:27,301 I need to go one step\nfurther and say .value. 
21189 20:14:27,301 --> 20:14:29,301 So, similar in spirit\nto Python, where we 21190 20:14:29,301 --> 20:14:32,211 saw a lot of dot notation, where\nyou can go inside an object 21191 20:14:32,211 --> 20:14:34,191 inside of an object,\nthat's what's going on. 21192 20:14:34,191 --> 20:14:38,091 Long story short, in JavaScript,\n 21193 20:14:38,092 --> 20:14:41,392 called document, that lets you just do\n 21194 20:14:42,021 --> 20:14:44,361 One of those functions\nis called query selector. 21195 20:14:44,361 --> 20:14:47,781 That function returns to you\nwhatever it is you're selecting. 21196 20:14:47,782 --> 20:14:51,032 And dot value means go\ninside of that rectangle 21197 20:14:51,032 --> 20:14:54,272 and grab the actual text\nthat the human typed in. 21198 20:14:54,271 --> 20:14:58,251 So if I want to now say,\nHello, to that person 21199 20:14:58,251 --> 20:15:01,011 the syntax is a little\ndifferent from C and Python. 21200 20:15:01,012 --> 20:15:03,741 I can use concatenation, which\nactually does exist in Python 21201 20:15:04,941 --> 20:15:10,371 I can go ahead and say hello,\nquote unquote "Hello," plus name. 21202 20:15:10,372 --> 20:15:13,131 All right, now, if I go\nback to the browser window 21203 20:15:13,131 --> 20:15:16,911 reload the page, to get the latest\n 21204 20:15:16,911 --> 20:15:20,031 and click Submit, now\nI see, Hello, David. 21205 20:15:20,032 --> 20:15:23,422 Not the best website, but\nit does demonstrate how 21206 20:15:23,422 --> 20:15:25,832 I can start to interact with the page. 
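As a rough sketch of the steps just described (select the element whose ID is name, read its value, concatenate), the function below builds the greeting. It assumes an input with id="name", as in the lecture. The fakeDocument object is purely a stand-in invented for this example so the code can also run outside a browser; on a real page you would pass document instead and hand the result to alert.

```javascript
// Build the greeting from whatever was typed into the element with id="name".
// doc is anything with a querySelector method; in a browser, pass document.
function greeting(doc) {
    // '#name' selects by unique ID, just like a CSS selector
    let name = doc.querySelector('#name').value;
    return 'hello, ' + name;
}

// Hypothetical stand-in for the browser's document object
const fakeDocument = {
    querySelector: (selector) => (selector === '#name' ? { value: 'David' } : null)
};

console.log(greeting(fakeDocument));
```

In a browser, alert(greeting(document)) would pop up the same text for whatever the user typed.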
21207 20:15:25,831 --> 20:15:29,211 But let me stipulate that\nthis co-mingling of languages 21208 20:15:30,661 --> 20:15:33,171 It's fine to use\nclasses, but using style 21209 20:15:33,172 --> 20:15:35,345 equals quote unquote and\na whole bunch of CSS 21210 20:15:35,345 --> 20:15:38,512 that was not going to scale well, once\n 21211 20:15:38,512 --> 20:15:40,672 Same here, once you\nhave more and more code 21212 20:15:40,672 --> 20:15:44,461 you don't want to just put your code\n 21213 20:15:45,751 --> 20:15:48,301 Let's get rid of that\nonsubmit attribute 21214 20:15:48,301 --> 20:15:50,131 and literally never use it again. 21215 20:15:50,131 --> 20:15:52,411 That was for demonstration's sake only. 21216 20:15:53,801 --> 20:15:57,661 Let me move the script tag,\nactually, just below the form 21217 20:15:57,661 --> 20:16:01,591 but still inside the body, so\nthat the script tag exists only 21218 20:16:01,592 --> 20:16:04,482 after the form tag exists, logically. 21219 20:16:04,482 --> 20:16:08,411 Just like in Python, your code is\n 21220 20:16:10,751 --> 20:16:13,591 Let me define this function\ncalled Greet, and then 21221 20:16:13,592 --> 20:16:18,902 let me do this, document.querySelector,\n 21222 20:16:18,902 --> 20:16:20,232 It doesn't have a unique ID. 21223 20:16:21,062 --> 20:16:24,302 I can just reference it by name, form,\n 21224 20:16:24,301 --> 20:16:28,411 And let me call this special\nfunction, addEventListener. 21225 20:16:28,411 --> 20:16:31,471 This is a function that\nlistens for events. 21226 20:16:31,471 --> 20:16:34,201 Now this is actually a term\nof art within programming. 21227 20:16:34,202 --> 20:16:37,172 Many different languages\nare governed by events. 21228 20:16:37,172 --> 20:16:40,652 And pretty much any user interface is\n 21229 20:16:40,652 --> 20:16:45,264 On phones, you have touches, and you\n 21230 20:16:45,264 --> 20:16:47,432 and you have pinch, and all\nof these other gestures. 
21231 20:16:47,432 --> 20:16:49,711 On your Mac or PC you\nhave click, you have drag 21232 20:16:49,711 --> 20:16:52,982 you have key down, key up, as\n 21233 20:16:53,861 --> 20:16:57,121 This is a non-exhaustive\nlist of all of the events 21234 20:16:57,122 --> 20:16:59,919 that you can listen for in the\ncontext of web programming. 21235 20:16:59,918 --> 20:17:02,251 And this might be a throwback\nto Scratch, where, recall 21236 20:17:02,251 --> 20:17:04,411 Scratch let you broadcast events. 21237 20:17:04,411 --> 20:17:07,741 And we had the two puppets sort of\n 21238 20:17:07,741 --> 20:17:11,611 In the world of web programming, game\n 21239 20:17:11,611 --> 20:17:14,131 these days, they're\njust governed by events. 21240 20:17:14,131 --> 20:17:16,961 And you write code that listens\nfor these events happening. 21241 20:17:16,961 --> 20:17:18,301 So what do I want to listen for? 21242 20:17:18,301 --> 20:17:21,461 Well, I want to add an event\nlistener for the Submit event. 21243 20:17:21,461 --> 20:17:26,771 And when that happens, I want to\n 21244 20:17:26,771 --> 20:17:28,801 So this is kind of interesting. 21245 20:17:28,801 --> 20:17:32,432 Thank you, I have my Greet\nfunction as before, no changes. 21246 20:17:32,432 --> 20:17:34,952 But I'm adding one\nline of code down here. 21247 20:17:34,952 --> 20:17:38,072 I'm telling the browser to\nuse document.queryselector 21248 20:17:39,331 --> 20:17:42,551 Then I'm adding an event listener,\n 21249 20:17:42,551 --> 20:17:44,671 And when that happens, I call Greet. 21250 20:17:44,672 --> 20:17:47,461 Notice I am not using\nparentheses after Greet. 21251 20:17:47,461 --> 20:17:49,381 I don't want to call Greet right away. 21252 20:17:49,381 --> 20:17:55,891 I want to tell the browser to call\n 21253 20:17:55,892 --> 20:18:03,992 Now let me go ahead and deliberately,\n 21254 20:18:03,991 --> 20:18:08,012 here, let me type in my name,\nDavid, submit, and there we go. 
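The point about passing greet without parentheses can be sketched as follows. This is an illustration under stated assumptions, not the lecture's file: formStandIn is a plain EventTarget standing in for document.querySelector('form') so the example runs outside a browser, and the greeted counter exists only to make the timing visible.

```javascript
// Stand-in for the form element; in a browser this would be
// document.querySelector('form'). (Assumption of this sketch.)
const formStandIn = new EventTarget();

// Counts how many times greet has actually been called
let greeted = 0;

function greet() {
    greeted++;
    console.log('hello, there');
}

// Pass the function itself: greet, not greet()
formStandIn.addEventListener('submit', greet);

// Nothing has run yet; greet is only called when the event occurs
console.log(greeted);

// Simulate the user submitting the form
formStandIn.dispatchEvent(new Event('submit'));
console.log(greeted);
```

Writing greet() inside addEventListener would instead call the function immediately and register its return value, which is not what is wanted here.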
21255 20:18:09,361 --> 20:18:13,231 All right, but let's now make\nthis slightly better designed. 21256 20:18:13,232 --> 20:18:16,572 Right now, I'm defining a\nfunction Greet, which is fine. 21257 20:18:16,572 --> 20:18:18,551 But I'm only using it in one place. 21258 20:18:18,551 --> 20:18:21,122 And you might recall, we\nstumbled on this in Python 21259 20:18:21,122 --> 20:18:24,211 where I was like, why are we creating\n 21260 20:18:24,211 --> 20:18:26,461 when we're only using\nit like one line later? 21261 20:18:26,461 --> 20:18:29,842 And we introduced what type of\nfunction in Python the other day? 21262 20:18:30,601 --> 20:18:33,221 SPEAKER 1: Yeah, so lambda\nfunctions, anonymous functions. 21263 20:18:33,221 --> 20:18:35,611 You can actually do this\nin JavaScript as well. 21264 20:18:35,611 --> 20:18:40,001 If I want to define a function all\n 21265 20:18:40,001 --> 20:18:43,661 Let me cut this onto my\nclipboard, paste it over here. 21266 20:18:43,661 --> 20:18:45,451 Let me fix all of the alignment. 21267 20:18:46,982 --> 20:18:50,672 And I can actually, now, do this. 21268 20:18:50,672 --> 20:18:52,351 The syntax is a little weird. 21269 20:18:52,351 --> 20:18:56,021 But using now just these four\nlines of code, I can do this. 21270 20:18:56,021 --> 20:18:59,531 I can tell the browser to add an\n 21271 20:18:59,532 --> 20:19:03,331 And then when it hears that, call\n 21272 20:19:03,331 --> 20:19:06,151 And unlike Python, this function\ncan have multiple lines 21273 20:19:06,150 --> 20:19:07,650 which is actually a nice thing. 21274 20:19:08,714 --> 20:19:11,131 There's a lot of indentation\nin curly braces going on now. 21275 20:19:11,131 --> 20:19:15,452 But you can think of this as just\n 21276 20:19:15,452 --> 20:19:18,149 when the form is submitted. 21277 20:19:18,149 --> 20:19:20,732 But if I want to block the form\nfrom actually being submitted 21278 20:19:20,732 --> 20:19:21,722 I've got to do one other thing. 
21279 20:19:21,721 --> 20:19:24,929 And you would only know this from being\n 21280 20:19:24,929 --> 20:19:28,291 I need to do this\nfunction, prevent default 21281 20:19:28,292 --> 20:19:32,122 passing in this E argument, which\nis a variable that represents 21282 20:19:32,122 --> 20:19:33,872 the event, more on\nthat another time, that 21283 20:19:33,872 --> 20:19:36,482 just allows us to prevent\nwhatever the default 21284 20:19:36,482 --> 20:19:39,512 handling of that particular event is. 21285 20:19:39,512 --> 20:19:42,782 So long story short, this is\nrepresentative of the type of code 21286 20:19:42,782 --> 20:19:45,961 you might write in JavaScript,\nwhereby you can actually interact 21287 20:19:45,961 --> 20:19:48,422 with your code, the user's actual form. 21288 20:19:48,422 --> 20:19:50,192 And we can do interesting things, too. 21289 20:19:50,191 --> 20:19:53,531 Built into browsers nowadays\nis functionality like this. 21290 20:19:53,532 --> 20:19:57,251 So here's a very simple example, that\n 21291 20:19:58,381 --> 20:20:00,751 Well, it turns out using\nJavaScript, you can control 21292 20:20:00,751 --> 20:20:02,881 the CSS of a page programmatically. 21293 20:20:02,881 --> 20:20:05,281 I can change the background\nof the body of the page 21294 20:20:05,282 --> 20:20:10,382 to red, to green, to blue, just by\n 21295 20:20:10,381 --> 20:20:12,482 and then changing CSS properties. 21296 20:20:12,482 --> 20:20:15,271 Just to give you a taste of this,\nif I view the page's source 21297 20:20:15,271 --> 20:20:19,711 similar code here, I can\nselect the red button by an ID 21298 20:20:19,711 --> 20:20:22,441 that I apparently defined\non it, right up here. 21299 20:20:22,441 --> 20:20:25,871 I can add an event listener, this\n 21300 20:20:25,872 --> 20:20:28,199 And when it's clicked, I\nexecute this one line of code. 
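Looking back at the submit listener with preventDefault, here is a minimal sketch of the same idea. It assumes a plain EventTarget named loginForm in place of the real form element (a name made up for this example), and it relies on the standard behavior that dispatchEvent returns false when a cancelable event has had its default handling prevented.

```javascript
// Stand-in for document.querySelector('form'); hypothetical name for this sketch
const loginForm = new EventTarget();

// Anonymous function as the listener; e represents the event
loginForm.addEventListener('submit', function(e) {
    console.log('hello, David');
    // Block the default handling (for a real form: sending it to the server)
    e.preventDefault();
});

// Simulate a submission; cancelable must be true for preventDefault to have an effect
const submitEvent = new Event('submit', { cancelable: true });
const proceeded = loginForm.dispatchEvent(submitEvent);

console.log(submitEvent.defaultPrevented);
console.log(proceeded);
```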
21301 20:20:28,198 --> 20:20:30,240 And this one line of code\nwe haven't seen before 21302 20:20:30,240 --> 20:20:33,871 but you can go into the body of\nthe page, its style property 21303 20:20:33,872 --> 20:20:36,331 and you can change its\nbackground color to red. 21304 20:20:36,331 --> 20:20:38,851 This is one example of\ntwo different groups 21305 20:20:38,851 --> 20:20:40,562 not talking to one another in advance. 21306 20:20:40,562 --> 20:20:43,911 In CSS, properties that have two\nwords are usually hyphenated 21307 20:20:45,592 --> 20:20:48,501 Unfortunately, in JavaScript, if\n 21308 20:20:48,501 --> 20:20:52,142 that's subtraction, which is\nlogically nonsensical here. 21309 20:20:52,142 --> 20:20:56,932 So in CSS, you can convert\nbackground-color to, in JavaScript 21310 20:20:56,932 --> 20:20:59,601 background Color, where\nyou capitalize the C 21311 20:20:59,601 --> 20:21:01,851 and you get rid of the minus sign. 21312 20:21:03,182 --> 20:21:05,900 Well, back in the day, there\nused to be a blink tag. 21313 20:21:05,900 --> 20:21:07,971 And it's one of the\nfew historical examples 21314 20:21:07,971 --> 20:21:12,682 of a tag that was removed from HTML,\n 21315 20:21:12,682 --> 20:21:14,122 this is what the web looked like. 21316 20:21:14,122 --> 20:21:16,176 There was a lot of this kind of stuff. 21317 20:21:16,176 --> 20:21:18,051 There was even a marquee\nthat would move text 21318 20:21:18,051 --> 20:21:19,509 from left to right over the screen. 21319 20:21:19,509 --> 20:21:21,781 And the web was a very ugly place. 21320 20:21:21,782 --> 20:21:25,071 I will admit, my very first web page\n 21321 20:21:25,070 --> 20:21:26,542 But how can we bring it back? 
21322 20:21:26,542 --> 20:21:29,691 Well, this is a version of the\n 21323 20:21:30,202 --> 20:21:35,722 I wrote some code in this example, that\n 21324 20:21:35,721 --> 20:21:39,471 the CSS of the page to be\nvisible, invisible, visible 21325 20:21:39,471 --> 20:21:42,891 invisible, because built into\nJavaScript is support for a clock. 21326 20:21:42,892 --> 20:21:45,452 So you can just do something\non some sort of schedule. 21327 20:21:45,452 --> 20:21:47,782 Let me go ahead and open up\nthis example, autocomplete. 21328 20:21:49,251 --> 20:21:53,751 In Autocomplete.html, I whipped up as\n 21329 20:21:53,751 --> 20:21:56,482 but I also grabbed the\ndictionary from problem 21330 20:21:56,482 --> 20:22:00,711 set 5 speller, so that if I want\n 21331 20:22:00,711 --> 20:22:04,491 this searches that 140,000\nwords, using JavaScript 21332 20:22:04,491 --> 20:22:07,380 to create what we know in the\nworld of the web as autocomplete. 21333 20:22:07,380 --> 20:22:09,172 When you start searching\nfor something, you 21334 20:22:09,172 --> 20:22:11,632 should start to see words\nthat start with that phrase. 21335 20:22:11,631 --> 20:22:14,164 And sure enough, if I search\nfor something like banana 21336 20:22:14,164 --> 20:22:17,331 here's the three variants of bananas\n 21337 20:22:18,381 --> 20:22:21,811 Just JavaScript, when\nit finds matching words 21338 20:22:21,812 --> 20:22:24,862 it's just updating the DOM, the\ntree in the computer's memory 21339 20:22:24,861 --> 20:22:27,921 to show more and more text, or less. 21340 20:22:27,922 --> 20:22:34,162 And for one final example, this is how\n 21341 20:22:35,872 --> 20:22:40,042 You have built into browsers today some\n 21342 20:22:40,042 --> 20:22:44,152 interfaces, whereby you can ask for\n 21343 20:22:44,152 --> 20:22:47,601 For instance, here, I wrote a\n 21344 20:22:47,601 --> 20:22:49,448 apparently asking to know my location. 
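The autocomplete idea mentioned above, showing words that start with what has been typed so far, can be sketched as a simple prefix filter. The short WORDS array is a made-up stand-in for the roughly 140,000-word dictionary, and the function name autocomplete is an assumption; the lecture's actual file is not shown here, and a real page would write the matches into the DOM rather than print them.

```javascript
// Tiny stand-in for the dictionary (hypothetical data)
const WORDS = ['apple', 'banana', 'bananas', 'band', 'cat'];

// Return every word that starts with the given prefix
function autocomplete(words, prefix) {
    // With nothing typed yet, show no suggestions
    if (prefix === '') {
        return [];
    }
    const needle = prefix.toLowerCase();
    return words.filter((word) => word.startsWith(needle));
}

console.log(autocomplete(WORDS, 'ban'));
console.log(autocomplete(WORDS, 'banana'));
```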
21345 20:22:49,448 --> 20:22:51,531 All right, let me go ahead\nand allow it this time 21346 20:22:51,532 --> 20:22:54,202 if that's something you're\ncomfortable with on your own device. 21347 20:22:54,202 --> 20:22:57,202 It's taking a moment, because sometimes\n 21348 20:22:58,081 --> 20:23:02,601 But, hopefully, in just a moment, there\n 21349 20:23:02,601 --> 20:23:06,232 and as a final flourish today, for what\n 21350 20:23:06,232 --> 20:23:08,601 for your structure, CSS\nfor your style, and now 21351 20:23:08,601 --> 20:23:13,372 JavaScript for your logic, which we'll\n 21352 20:23:13,372 --> 20:23:15,982 and search Google for\nthose GPS coordinates. 21353 20:23:15,982 --> 20:23:21,472 Zoom in here on Google Maps,\nand if we zoom in, in, in 21354 20:23:22,461 --> 20:23:26,122 We're not on that street, but\nthere, oh, there it is, actually. 21355 20:23:26,122 --> 20:23:27,832 There is the marker it had put for us. 21356 20:23:27,831 --> 20:23:30,118 We're indeed here in Memorial Hall. 21357 20:23:30,119 --> 20:23:32,452 So all that with JavaScript,\nbut the basic understanding 21358 20:23:32,452 --> 20:23:34,252 of the DOM and the\ndocument object model 21359 20:23:34,251 --> 20:23:36,481 we'll pick up where\nwe left off next week. 21360 20:24:57,172 --> 20:25:01,222 So this is CS50 and this is\nweek nine, and this is it 21361 20:25:01,221 --> 20:25:03,471 in terms of programming fundamentals. 21362 20:25:03,471 --> 20:25:06,250 Today, we come rather full circle\nwith so many of the languages 21363 20:25:06,250 --> 20:25:08,542 that we've been looking at\nover the past several weeks. 21364 20:25:08,542 --> 20:25:11,512 And with HTML and CSS\nand JavaScript last week 21365 20:25:11,512 --> 20:25:14,782 we're going to add back into\nthe mix, Python and SQL. 21366 20:25:14,782 --> 20:25:18,182 And with that, do we have the\nability to program for the web. 
21367 20:25:18,182 --> 20:25:20,601 And even though this isn't\nthe only user interface out 21368 20:25:20,601 --> 20:25:23,691 there, increasingly-- or people\n 21369 20:25:23,691 --> 20:25:26,721 and a browser to access applications\nthat people have written 21370 20:25:26,721 --> 20:25:31,311 but it's also, increasingly, the way\n 21371 20:25:31,312 --> 20:25:33,532 There are languages\ncalled Swift for iOS 21372 20:25:33,532 --> 20:25:35,692 there are languages\ncalled Java for Android 21373 20:25:35,691 --> 20:25:38,512 but coding applications\nin both of those languages 21374 20:25:38,512 --> 20:25:42,292 means knowing twice as many languages,\n 21375 20:25:42,952 --> 20:25:45,502 So we're increasingly seeing,\nfor better or for worse 21376 20:25:45,501 --> 20:25:47,601 that the world is starting\nto really standardize 21377 20:25:47,601 --> 20:25:51,501 at least for the next some number of\n 21378 20:25:51,501 --> 20:25:55,491 coupled with other languages like\n 21379 20:25:55,491 --> 20:25:57,929 And so today, we'll tie\nall of those together 21380 20:25:57,929 --> 20:26:00,471 and give you the last of the\ntools in your toolkit with which 21381 20:26:00,471 --> 20:26:03,051 to tackle final projects to\ngo off into the real world 21382 20:26:03,051 --> 20:26:05,961 ultimately, and somehow solve\nproblems with programming. 21383 20:26:05,961 --> 20:26:11,072 But we need an additional tool today,\n 21384 20:26:11,072 --> 20:26:14,239 This is just a program that\ncomes on certain computers 21385 20:26:14,239 --> 20:26:16,072 that you can install\nfor free, happens to be 21386 20:26:16,072 --> 20:26:18,741 written in a language called\nJavaScript, but it's a program 21387 20:26:18,741 --> 20:26:22,101 that we've been using to\nrun a web server in VS Code. 21388 20:26:22,101 --> 20:26:24,932 But you can run it on your own\nMac or PC or anywhere else. 
21389 20:26:24,932 --> 20:26:28,702 But all this particular\nHTTP server does is 21390 20:26:28,702 --> 20:26:32,512 serve up static content\nlike HTML files, CSS files 21391 20:26:32,512 --> 20:26:36,982 JavaScript files, maybe images, maybe\n 21392 20:26:36,982 --> 20:26:41,362 It has no ability to really interact\n 21393 20:26:41,361 --> 20:26:46,561 You can create a web form and serve\n 21394 20:26:46,562 --> 20:26:50,392 but if the human types in input into\n 21395 20:26:50,392 --> 20:26:53,842 submit it elsewhere to something like\n 21396 20:26:53,842 --> 20:26:56,961 it's not actually going to go anywhere\n 21397 20:26:56,961 --> 20:26:59,221 process the requests that are coming in. 21398 20:26:59,221 --> 20:27:02,151 So today, we're going to introduce\nanother type of server that 21399 20:27:02,152 --> 20:27:06,112 comes with Python that allows\nus to not only serve web pages 21400 20:27:06,111 --> 20:27:07,731 but also process user input. 21401 20:27:07,732 --> 20:27:11,482 And recall that all that input is\n 21402 20:27:11,482 --> 20:27:14,062 or more deeply inside of\nthose virtual envelopes. 21403 20:27:14,062 --> 20:27:17,902 So here's the canonical URL we talked\n 21404 20:27:20,042 --> 20:27:23,782 And I've highlighted the slash to\n 21405 20:27:23,782 --> 20:27:26,302 like the default folder\nwhere, presumably, there's 21406 20:27:26,301 --> 20:27:29,911 a file called index.html\nor something else in there. 21407 20:27:29,911 --> 20:27:32,122 Otherwise, you might have\na more explicit mention 21408 20:27:32,122 --> 20:27:34,551 of the actual file named file.html. 
21409 20:27:34,551 --> 20:27:37,634 You can have folders, as you probably\n 21410 20:27:38,134 --> 20:27:40,221 You can have files in\nfolders like this, and these 21411 20:27:40,221 --> 20:27:44,091 are all examples of what a programmer\n 21412 20:27:44,092 --> 20:27:45,922 So it might not just\nbe a single word, it 21413 20:27:45,922 --> 20:27:49,832 might have multiple slashes and multiple\n 21414 20:27:49,831 --> 20:27:51,831 But this is just more\ngenerally known as a path. 21415 20:27:51,831 --> 20:27:54,470 But there's another term of art\nthat's essentially equivalent 21416 20:27:54,471 --> 20:27:55,652 that we'll introduce today. 21417 20:27:55,652 --> 20:27:59,062 This is also synonymously\ncalled a route, which is maybe 21418 20:27:59,062 --> 20:28:03,022 a better generic description of what\n 21419 20:28:03,021 --> 20:28:05,512 they don't have to\nmap to, that is, refer 21420 20:28:05,512 --> 20:28:08,572 to a specific folder\nor a specific file, you 21421 20:28:08,572 --> 20:28:11,095 can come up with your\nown routes in a website. 21422 20:28:11,095 --> 20:28:13,762 And just make sure that when the\nuser visits that, you give them 21423 20:28:14,661 --> 20:28:17,369 If they visit something else, you\n 21424 20:28:17,369 --> 20:28:20,241 It doesn't have to map to a very\n 21425 20:28:20,241 --> 20:28:23,421 And if you want to get input from\n 21426 20:28:23,422 --> 20:28:28,222 like q=cats, you can add a question\n 21427 20:28:28,221 --> 20:28:32,061 The key, or the HTTP parameter name\n 21428 20:28:32,062 --> 20:28:35,025 and then equals some value that,\npresumably, the human typed in. 
21429 20:28:35,024 --> 20:28:37,191 If you have more of these,\nyou can put an ampersand 21430 20:28:37,191 --> 20:28:41,902 and then more key equals value pairs\n 21431 20:28:41,902 --> 20:28:46,532 The catch, though, is that using the\n 21432 20:28:46,532 --> 20:28:51,682 we don't really have the ability to\n 21433 20:28:53,601 --> 20:28:56,661 You could have appended question\n 21434 20:28:56,661 --> 20:29:00,561 to any of URLs in your home\npage for problem set eight 21435 20:29:00,562 --> 20:29:04,222 but it doesn't actually do\nanything useful, necessarily 21436 20:29:04,221 --> 20:29:06,111 unless you use some fancy JavaScript. 21437 20:29:06,111 --> 20:29:09,241 The server is not going to bother\neven looking in that for you. 21438 20:29:09,241 --> 20:29:12,141 But today, we're going to\nintroduce using a bit of Python. 21439 20:29:12,142 --> 20:29:15,982 And in fact, we're going to use a web\n 21440 20:29:15,982 --> 20:29:19,312 of using HTTP server alone,\nto automatically, for you 21441 20:29:19,312 --> 20:29:22,192 look for any key value pairs\nafter the question mark 21442 20:29:22,191 --> 20:29:25,881 and then hand them to you in\nthe form of a Python dictionary. 21443 20:29:25,881 --> 20:29:29,781 Recall that a dictionary in Python, a\n 21444 20:29:29,782 --> 20:29:33,917 That seems like a perfect fit\nfor these kinds of parameters. 21445 20:29:33,917 --> 20:29:36,292 And you're not going to have\nto write that code yourself. 21446 20:29:36,292 --> 20:29:39,902 It's going to be handed to you by\n 21447 20:29:39,902 --> 20:29:41,782 So this will be the\nsecond of two frameworks 21448 20:29:41,782 --> 20:29:43,324 really, that we look at in the class. 21449 20:29:43,323 --> 20:29:46,011 And a framework is essentially\na bunch of libraries 21450 20:29:46,012 --> 20:29:48,502 that someone else wrote\nand a set of conventions 21451 20:29:48,501 --> 20:29:49,951 therefore, for doing things. 
21452 20:29:49,952 --> 20:29:52,762 So those of you who really\nstarted dabbling with Bootstrap 21453 20:29:52,762 --> 20:29:56,441 this past week to make your home\n 21454 20:29:58,211 --> 20:30:01,872 Well, you're using libraries, code that\n 21455 20:30:01,872 --> 20:30:04,961 maybe some of the JavaScript that\n 21456 20:30:04,961 --> 20:30:08,741 But it's also a framework in the\n 21457 20:30:08,741 --> 20:30:11,262 You have to use Bootstrap's\nclasses, and you 21458 20:30:11,262 --> 20:30:15,582 have to lay out your divs or\nyour spans or your table tags 21459 20:30:15,581 --> 20:30:17,424 in a sort of Bootstrap-friendly way. 21460 20:30:17,425 --> 20:30:19,842 And it's not too onerous, but\nyou're following conventions 21461 20:30:19,842 --> 20:30:21,682 that a bunch of humans standardized on. 21462 20:30:21,682 --> 20:30:25,422 So similarly, in the world of\nPython, there is another framework 21463 20:30:25,422 --> 20:30:26,802 we're going to start using today. 21464 20:30:26,801 --> 20:30:30,191 And whereas Bootstrap is\nused for CSS and JavaScript 21465 20:30:30,191 --> 20:30:32,231 Flask is going to be used for Python. 21466 20:30:32,232 --> 20:30:34,932 And it just solves a lot\nof common problems for us. 21467 20:30:34,932 --> 20:30:38,142 It's going to make it easier\nfor us to analyze the URLs 21468 20:30:38,142 --> 20:30:40,782 and get key value pairs,\nit's going to make it easier 21469 20:30:40,782 --> 20:30:43,872 for us to find files or\nimages that the human wants 21470 20:30:43,872 --> 20:30:45,312 to see when visiting our website. 21471 20:30:45,312 --> 20:30:48,204 It's even going to make it easier\nto send emails automatically 21472 20:30:48,203 --> 20:30:49,661 like when someone fills out a form. 21473 20:30:49,661 --> 20:30:52,971 You can dynamically, using code,\nsend them an email as well.
21474 20:30:52,971 --> 20:30:55,271 So Flask, and with it\nsome related libraries 21475 20:30:55,271 --> 20:30:58,512 it's just going to make stuff\nlike that easier for us. 21476 20:30:58,512 --> 20:31:03,101 And to do this, all we have to do\n 21477 20:31:03,101 --> 20:31:05,211 requirements of this framework. 21478 20:31:05,211 --> 20:31:08,411 We're going to have to create a\n 21479 20:31:08,411 --> 20:31:11,411 this is where our web app or\napplication is going to live. 21480 20:31:11,411 --> 20:31:15,881 If we have any libraries that we want to\n 21481 20:31:15,881 --> 20:31:19,512 is to have a very simple text\nfile called requirements.txt 21482 20:31:19,512 --> 20:31:21,851 where you list the names\nof those libraries 21483 20:31:21,851 --> 20:31:26,652 top to bottom, in that text file,\n 21484 20:31:26,652 --> 20:31:30,411 or the import statements that we\n 21485 20:31:30,411 --> 20:31:33,461 We're going to have a static\nfolder or static directory, which 21486 20:31:33,461 --> 20:31:36,611 means any files you create that\nare not ever going to change 21487 20:31:36,611 --> 20:31:39,203 like images, CSS files,\nJavaScript files 21488 20:31:39,203 --> 20:31:40,661 they're going to go in this folder. 21489 20:31:40,661 --> 20:31:43,512 And then lastly, any\nHTML that you write 21490 20:31:43,512 --> 20:31:45,911 web pages you want the\nhuman to see, are going 21491 20:31:45,911 --> 20:31:47,751 to go in a folder called templates. 21492 20:31:47,751 --> 20:31:50,711 So this is, again, evidence of\nwhat we mean by a framework. 21493 20:31:50,711 --> 20:31:52,512 Do you have to make a web app like this? 21494 20:31:52,512 --> 20:31:54,702 No, but if you're using\nthis particular framework 21495 20:31:54,702 --> 20:31:58,752 this is what people decided\nwould be the human conventions. 
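[Editor's sketch] The folder conventions listed above can be pictured by building an empty skeleton. This is a hedged illustration only: the project folder name "hello" and the use of a temporary directory are assumptions, and the app.py file name is taken from later in the lecture rather than from this passage.

```python
from pathlib import Path
import tempfile

# Build the skeleton in a throwaway temporary directory.
# "hello" is a hypothetical project name chosen for this sketch.
root = Path(tempfile.mkdtemp()) / "hello"

(root / "static").mkdir(parents=True)     # files that never change: images, CSS, JavaScript
(root / "templates").mkdir()              # HTML pages the human will see
(root / "app.py").touch()                 # where the web application's Python code lives
(root / "requirements.txt").write_text("Flask\n")  # one library name per line

# Print the layout relative to the project root.
layout = sorted(str(p.relative_to(root)) for p in root.rglob("*"))
for entry in layout:
    print(entry)
```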
21496 20:31:58,751 --> 20:32:02,961 If you've heard of other frameworks like\n 21497 20:32:02,961 --> 20:32:06,131 there are just different conventions\n 21498 20:32:06,131 --> 20:32:10,152 Flask is a very nice\nmicroframework in that that's it. 21499 20:32:10,152 --> 20:32:13,631 All you have to do is adhere to\n 21500 20:32:13,631 --> 20:32:16,601 to get some code up and running. 21501 20:32:16,601 --> 20:32:19,381 All right, so let's go\nahead and make a web app. 21502 20:32:19,381 --> 20:32:21,131 Let me go ahead and\nswitch over to VS Code 21503 20:32:21,131 --> 20:32:23,173 here, and let me practice\nwhat I'm preaching here 21504 20:32:25,482 --> 20:32:29,172 And let's go ahead and create\nan application that very simply 21505 20:32:29,172 --> 20:32:31,461 maybe, says hello to the user. 21506 20:32:31,461 --> 20:32:35,922 So something that, initially, is not all\n 21507 20:32:35,922 --> 20:32:37,882 But we'll build on that\nas we've always done. 21508 20:32:37,881 --> 20:32:41,861 So in app.py, what I'm going to do\n 21509 20:32:41,861 --> 20:32:43,182 I had on the screen earlier. 21510 20:32:43,182 --> 20:32:49,601 From Flask, import Flask, with a capital\n 21511 20:32:49,601 --> 20:32:52,152 And I'm also going to\npreemptively import 21512 20:32:52,152 --> 20:32:56,202 a couple of functions,\nrender template, and request. 21513 20:32:56,202 --> 20:32:58,122 More on those in just a bit. 21514 20:32:58,122 --> 20:33:00,762 And then below that, I'm going\nto say, go ahead and do this. 21515 20:33:00,762 --> 20:33:03,491 Give me a web-- a\nvariable called app that's 21516 20:33:03,491 --> 20:33:07,691 going to be the result of calling\n 21517 20:33:07,691 --> 20:33:10,161 this weird incantation here, name. 21518 20:33:10,161 --> 20:33:13,572 So we've seen this a few weeks back\n 21519 20:33:13,572 --> 20:33:16,301 and we had that if main thing\nat the bottom of the screen. 
21520 20:33:16,301 --> 20:33:21,851 For now, just know that __name__\n 21521 20:33:21,851 --> 20:33:26,322 And so this line here, simple as\n 21522 20:33:26,322 --> 20:33:30,072 turn this file into a Flask application. 21523 20:33:30,072 --> 20:33:33,792 Flask is a function that just figures\n 21524 20:33:33,792 --> 20:33:37,392 The last thing I'm going to do for this\n 21525 20:33:37,392 --> 20:33:41,052 I'm going to say that I'm\ngoing to have a function called 21526 20:33:41,051 --> 20:33:42,971 index that takes no arguments. 21527 20:33:42,971 --> 20:33:44,891 And whenever this\nfunction is called, I want 21528 20:33:44,892 --> 20:33:50,322 to return the results of rendering\na template called index.html. 21529 20:33:51,277 --> 20:33:53,652 So let's assume there's a file\nsomewhere, haven't created 21530 20:33:55,482 --> 20:33:57,702 But render template\nmeans render this file, 21531 20:33:57,702 --> 20:34:01,162 that is, print it to the\nuser's screen, so to speak. 21532 20:34:01,161 --> 20:34:03,971 The last thing I'm going to\ndo is I have to tell Flask 21533 20:34:03,971 --> 20:34:06,261 when to call this index function. 21534 20:34:06,262 --> 20:34:11,927 And so I'm going to tell it to define\n 21535 20:34:13,029 --> 20:34:15,072 So let's take a look at\nwhat I just created here. 21536 20:34:15,072 --> 20:34:18,042 This is slightly new syntax, and\nit's really the only weirdness 21537 20:34:18,042 --> 20:34:19,812 that we'll have today in Python. 21538 20:34:19,812 --> 20:34:22,512 This is what's known in Python\nas a decorator. 21539 20:34:22,512 --> 20:34:24,521 A decorator is a\nspecial type of function 21540 20:34:24,521 --> 20:34:27,072 that modifies, essentially,\nanother function. 21541 20:34:27,072 --> 20:34:30,111 For our purposes, just know\nthat on line six this says 21542 20:34:30,111 --> 20:34:33,461 hey Python, define a route\nfor slash, the default 21543 20:34:33,461 --> 20:34:35,262 page on my website application.
21544 20:34:35,262 --> 20:34:37,661 The next two lines, seven\nand eight, say, hey Python 21545 20:34:37,661 --> 20:34:40,391 define a function called\nindex, takes no arguments. 21546 20:34:40,392 --> 20:34:45,282 And the only thing you should ever do is\n 21547 20:34:48,411 --> 20:34:51,461 So really, the next question,\n 21548 20:34:55,251 --> 20:34:56,991 Well, let me go ahead and do that next. 21549 20:34:56,991 --> 20:35:00,262 Let me create a directory called\ntemplates, practicing, again 21550 20:35:01,521 --> 20:35:04,281 So I'm going to create a new\nempty directory called templates 21551 20:35:04,282 --> 20:35:08,002 I'm going to go and\nCD into that directory 21552 20:35:08,001 --> 20:35:11,661 and then do code of index.html. 21553 20:35:11,661 --> 20:35:13,281 So here is going to be my index page. 21554 20:35:13,282 --> 20:35:16,227 And I'm going to do a very\nsimple web page, doc type HTML. 21555 20:35:16,226 --> 20:35:18,351 I'm just going to borrow\nsome stuff from last week. 21556 20:35:18,351 --> 20:35:20,601 HTML language equals English. 21557 20:35:21,771 --> 20:35:25,491 I'll then do a head tag, I'll do a meta\n 21558 20:35:25,491 --> 20:35:28,012 This makes my site, recall, responsive. 21559 20:35:28,012 --> 20:35:30,741 That is, it just grows and shrinks\nto fit the size of the device. 21560 20:35:30,741 --> 20:35:34,221 The initial scale for which is going\n 21561 20:35:34,221 --> 20:35:36,261 is going to be device width. 21562 20:35:36,262 --> 20:35:38,301 So I'm typing this out,\nI have it printed here. 21563 20:35:38,301 --> 20:35:40,251 This is stuff I typically copy paste. 21564 20:35:40,251 --> 20:35:42,951 But then lastly, I'm going to\nadd in my title, which will just 21565 20:35:42,952 --> 20:35:44,512 be hello for the name of this app. 21566 20:35:46,402 --> 20:35:50,271 The body of this tag will be-- 21567 20:35:51,232 --> 20:35:54,902 The body of this page, rather,\nwill just be hello comma world.
21568 20:35:54,902 --> 20:35:58,441 So very uninteresting and really a\n 21569 20:35:58,441 --> 20:36:01,792 But let's go now and experiment\nwith these two files. 21570 20:36:01,792 --> 20:36:03,652 I'm not going to bother\nwith a static folder 21571 20:36:03,652 --> 20:36:06,652 right now, because I don't have any\n 21572 20:36:06,652 --> 20:36:08,752 No images, no CSS, nothing like that. 21573 20:36:08,751 --> 20:36:11,761 And honestly, requirements.txt\nis going to be pretty simple. 21574 20:36:11,762 --> 20:36:15,562 I'm going to go requirements.txt and\n 21575 20:36:15,562 --> 20:36:18,952 access to the Flask library itself. 21576 20:36:18,952 --> 20:36:21,952 All right, but that's the only\n 21577 20:36:21,952 --> 20:36:27,232 All right, so now I have two files,\n 21578 20:36:27,232 --> 20:36:30,711 But index.html, thank you, is\ninside of my templates directory. 21579 20:36:30,711 --> 20:36:33,211 So how do I actually start\na web server? Last week, 21580 20:36:33,211 --> 20:36:34,491 I would have said HTTP server. 21581 20:36:34,491 --> 20:36:36,831 But HTTP server is not a Python thing. 21582 20:36:36,831 --> 20:36:41,152 It has no idea about Flask or\nPython or anything I just wrote. 21583 20:36:41,152 --> 20:36:43,982 HTTP server will just\nspit out static files. 21584 20:36:43,982 --> 20:36:47,286 So if I ran HTTP server, and\nthen I clicked on app.py 21585 20:36:47,286 --> 20:36:49,251 I would literally see my Python code.
21586 20:36:49,251 --> 20:36:53,421 It would not get executed because HTTP\n 21587 20:36:53,422 --> 20:36:57,502 But today, I'm going to run a\n 21588 20:36:57,501 --> 20:37:01,251 So this framework Flask that I\nactually preinstalled in advance 21589 20:37:01,251 --> 20:37:04,941 so it wasn't strictly necessary that I\n 21590 20:37:04,941 --> 20:37:08,902 yet, comes with a program called Flask,\n 21591 20:37:08,902 --> 20:37:12,472 the word run, and when I do that, you'll\n 21592 20:37:12,471 --> 20:37:14,661 week whereby you'll see the name-- 21593 20:37:14,661 --> 20:37:17,629 your URL for your\nunique preview of that. 21594 20:37:17,629 --> 20:37:20,211 You might see a pop up saying\nthat your application is running 21595 20:37:20,211 --> 20:37:22,161 on TCP port, something or other. 21596 20:37:22,161 --> 20:37:24,591 By default, last week,\nwe used port 8080. 21597 20:37:24,592 --> 20:37:27,622 Flask, just because, prefers port 5,000. 21598 20:37:28,702 --> 20:37:31,432 I'm going to go ahead\nand open up this URL now. 21599 20:37:31,432 --> 20:37:33,652 And once it authenticates\nand redirects me 21600 20:37:33,652 --> 20:37:37,222 just to make sure I'm allowed to access\n 21601 20:37:37,221 --> 20:37:40,221 Voila, there's the extent\nof this application. 21602 20:37:40,221 --> 20:37:43,911 If I view source by right-clicking\nor control clicking 21603 20:37:43,911 --> 20:37:45,891 there's my HTML that's been spit out. 21604 20:37:45,892 --> 20:37:48,832 So really, I've just reinvented\nthe wheel from last week 21605 20:37:48,831 --> 20:37:51,271 because there's no dynamism\nnow, nothing at all. 21606 20:37:52,968 --> 20:37:54,801 Let me close the source\nand let me zoom out. 21607 20:37:56,422 --> 20:37:59,601 Let me zoom in now, and I have\na very unique cryptic URL. 21608 20:37:59,601 --> 20:38:01,581 But the point is that\nit ends with nothing. 21609 20:38:01,581 --> 20:38:03,801 Or implicitly, it ends with slash. 
21610 20:38:03,801 --> 20:38:05,691 This is just Chrome\nbeing a little helpful. 21611 20:38:05,691 --> 20:38:08,762 It doesn't bother showing you a slash,\n 21612 20:38:08,762 --> 20:38:14,872 But let me do something explicit like\n 21613 20:38:14,872 --> 20:38:17,241 So there's a key value\npair that I've manually 21614 20:38:17,241 --> 20:38:20,661 typed into my URL bar and hit Enter. 21615 20:38:20,661 --> 20:38:22,161 Nothing happens, nothing changes. 21616 20:38:23,661 --> 20:38:27,982 But the opportunity today is to\n 21617 20:38:27,982 --> 20:38:31,172 from that URL and start\ndisplaying it to the user. 21618 20:38:31,172 --> 20:38:35,372 So let me go back over here to\nmy terminal window and code. 21619 20:38:35,372 --> 20:38:37,822 Let me move that down\nto the bottom there. 21620 20:38:37,822 --> 20:38:41,042 And what if I want to\nsay, huh, hello, name. 21621 20:38:41,042 --> 20:38:42,771 I ideally want to say something like-- 21622 20:38:42,771 --> 20:38:44,391 I don't want to hard code\nDavid because then it's never 21623 20:38:44,392 --> 20:38:46,282 going to say hello to anyone else. 21624 20:38:46,282 --> 20:38:51,772 I want to put like a variable name\n 21625 20:38:51,771 --> 20:38:55,311 But it's not an HTML tag, so I\nneed some kind of placeholder. 21626 20:38:57,441 --> 20:39:02,572 If I go back to my Python code, I can\n 21627 20:39:02,572 --> 20:39:06,562 And I can ask Flask to go\ninto the current request 21628 20:39:06,562 --> 20:39:10,432 into its arguments, that is\nin the URL, as they're called 21629 20:39:10,432 --> 20:39:14,721 and get whatever the value of\nthe parameter called name is. 21630 20:39:14,721 --> 20:39:16,792 That puts that into a variable for me. 21631 20:39:16,792 --> 20:39:19,702 And then, in render template--\nthis is one of those functions 21632 20:39:19,702 --> 20:39:21,562 that can take more than one argument. 
21633 20:39:21,562 --> 20:39:23,542 If it takes another\nargument, you can pass 21634 20:39:23,542 --> 20:39:25,502 in the name of any variable you want. 21635 20:39:25,501 --> 20:39:29,971 So if I want to pass in my name, I\n 21636 20:39:29,971 --> 20:39:34,731 So this is the name of a variable\n 21637 20:39:34,732 --> 20:39:39,682 This is the actual variable that\nI want to get the value from. 21638 20:39:39,682 --> 20:39:45,682 And now lastly, in my index.html,\n 21639 20:39:45,682 --> 20:39:49,792 is to do two curly braces and\nthen put the name of the variable 21640 20:39:51,354 --> 20:39:53,622 So here's what we mean by a template. 21641 20:39:53,622 --> 20:39:56,711 A template is like a blueprint\nin the real world, where 21642 20:39:56,711 --> 20:39:58,842 it's plans to make something. 21643 20:39:58,842 --> 20:40:02,711 This is the plan to make a web page\n 21644 20:40:02,711 --> 20:40:06,221 but there's this placeholder with\ntwo curly braces here and here 21645 20:40:06,221 --> 20:40:10,641 that says go ahead and plug in the\n 21646 20:40:10,642 --> 20:40:13,212 So in this sense, it's similar\nin spirit to our f strings 21647 20:40:13,211 --> 20:40:14,741 or format strings in Python. 21648 20:40:14,741 --> 20:40:17,658 The syntax is a little different\njust because reasonable people 21649 20:40:17,658 --> 20:40:19,991 disagree, different people,\ndifferent frameworks come up 21650 20:40:19,991 --> 20:40:21,116 with different conventions. 21651 20:40:21,116 --> 20:40:23,982 The convention in Flask,\nin their templates 21652 20:40:23,982 --> 20:40:26,442 is to use two curly braces here. 21653 20:40:26,441 --> 20:40:28,721 The hope is that you, the\nprogrammer, will never 21654 20:40:28,721 --> 20:40:32,241 want to display two curly\nbraces in your actual web page. 21655 20:40:32,241 --> 20:40:34,432 But even if you do,\nthere's a workaround. 21656 20:40:35,572 --> 20:40:39,012 So now let me go ahead and go\nback to my browser tab here. 
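[Editor's sketch] The pieces described above (reading the name parameter from the current request and plugging it into a two-curly-brace placeholder) might look roughly like this. It is a minimal sketch under stated assumptions: the template is inlined with render_template_string so the example fits in one file, whereas the lecture keeps it in templates/index.html and calls render_template; the INDEX_HTML constant is a hypothetical name.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline stand-in for templates/index.html (an assumption made so the
# sketch is self-contained). {{ name }} is the placeholder.
INDEX_HTML = "hello, {{ name }}"


@app.route("/")
def index():
    # Look in the URL's key=value pairs for a parameter called "name".
    name = request.args.get("name")
    # Left of the equal sign: the variable the template sees.
    # Right of the equal sign: the Python variable holding the value.
    return render_template_string(INDEX_HTML, name=name)


# Simulate a browser visiting /?name=David without starting a server.
response = app.test_client().get("/?name=David")
print(response.get_data(as_text=True))
```

Flask's built-in test client is used here only so the sketch can be run without opening a browser; in the lecture the same route is reached by running flask run and visiting the URL.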
21657 20:40:39,012 --> 20:40:41,922 Previously, even though\nI added name equals David 21658 20:40:41,922 --> 20:40:44,952 to the end of the URL with a question\n 21659 20:40:44,952 --> 20:40:47,682 But now, hopefully, if\nI made these changes 21660 20:40:47,682 --> 20:40:51,350 let me go ahead and open\nup my terminal window. 21661 20:40:51,350 --> 20:40:55,432 Let me restart Flask so it\nloads my changes by default. 21662 20:40:55,432 --> 20:40:59,380 Let me go back to my hello tab and\n 21663 20:41:01,001 --> 20:41:02,471 And there we go, hello, David. 21664 20:41:02,471 --> 20:41:05,501 I can play around now and I can change\n 21665 20:41:07,842 --> 20:41:10,552 And now we have something more dynamic. 21666 20:41:10,551 --> 20:41:14,741 So the new pieces here are, in\nPython, we have some code here 21667 20:41:14,741 --> 20:41:18,551 that allows us to access,\nprogrammatically, everything 21668 20:41:18,551 --> 20:41:20,711 that's after the\nquestion mark in the URL. 21669 20:41:20,711 --> 20:41:26,622 And the only thing we have to do that\n 21670 20:41:26,622 --> 20:41:28,452 You and I don't have\nto bother figuring out 21671 20:41:28,452 --> 20:41:30,869 where is the question mark,\nwhere is the equal sign, where 21672 20:41:30,869 --> 20:41:32,442 are the ampersands, potentially. 21673 20:41:32,441 --> 20:41:36,221 The framework, Flask,\ndoes all of that for us. 21674 20:41:36,221 --> 20:41:43,251 OK, any questions then on\nthese principles thus far? 21675 20:41:44,557 --> 20:41:50,142 AUDIENCE: Why do you say the\nquestion mark in the URL? 21676 20:41:50,142 --> 20:41:52,572 DAVID: Why do you need a\nquestion mark in the URL? 
21677 20:41:52,572 --> 20:41:59,152 The short answer is just because that\n 21678 20:41:59,152 --> 20:42:02,752 If you're making a GET request\nfrom a browser to a server 21679 20:42:02,751 --> 20:42:07,451 the convention, standardized by the\n 21680 20:42:07,452 --> 20:42:11,152 after the so-called route or\npath, then a question mark. 21681 20:42:11,152 --> 20:42:13,601 And it delineates what's\npart of the route or the path 21682 20:42:13,601 --> 20:42:17,751 and what's part of the\nhuman input to the right. 21683 20:42:19,226 --> 20:42:22,872 AUDIENCE: Can you go over again why\n 21684 20:42:23,372 --> 20:42:25,502 This is this annoying\nthing about Python. 21685 20:42:25,501 --> 20:42:29,551 When you pass in parameters\nto functions that have names 21686 20:42:29,551 --> 20:42:32,471 you typically say something\nequals something else. 21687 20:42:32,471 --> 20:42:35,042 So let me make a slight tweak here. 21688 20:42:35,042 --> 20:42:39,372 How about I say name of person here. 21689 20:42:39,372 --> 20:42:44,222 This allows me to invent my\nown variable for my template 21690 20:42:44,221 --> 20:42:46,211 and assign it the value of name. 21691 20:42:46,211 --> 20:42:52,676 I now, though, have to go into my\n 21692 20:42:57,072 --> 20:43:00,452 And so this is just stupid because\nit's unnecessarily verbose. 21693 20:43:00,452 --> 20:43:04,442 So what typically people do is they\n 21694 20:43:04,441 --> 20:43:08,361 itself, even though it looks admittedly\n 21695 20:43:08,361 --> 20:43:10,111 The thing to the left\nof the equal sign is 21696 20:43:10,111 --> 20:43:14,221 the name of the variable you plan to use\n 21697 20:43:14,221 --> 20:43:16,231 is the actual value you're assigning it. 21698 20:43:16,232 --> 20:43:18,092 And this is because it's general purpose. 21699 20:43:18,092 --> 20:43:21,332 I could override this and I could\nsay something like name always 21700 20:43:21,331 --> 20:43:23,881 equals Emma, no matter\nwhat that variable is.
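[Editor's sketch] The name=name question above comes down to ordinary Python keyword arguments. The toy render function below is a hypothetical stand-in written for this note, not Flask's implementation; it only shows that the word to the left of the equal sign is the placeholder the template uses, while the right-hand side is the value being passed in.

```python
def render(template, **context):
    """Toy stand-in for a template renderer (not how Flask does it)."""
    for key, value in context.items():
        # Replace each {{ key }} placeholder with the value passed for key.
        template = template.replace("{{ " + key + " }}", str(value))
    return template


name = "David"

# Same word on both sides: the template's variable and Python's variable.
first = render("hello, {{ name }}", name=name)

# A different template variable name works too, but the template must match.
second = render("hello, {{ name_of_person }}", name_of_person=name)

# The right-hand side can be any value, so the greeting can be hardcoded.
third = render("hello, {{ name }}", name="Emma")

print(first)
print(second)
print(third)
```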
21701 20:43:23,881 --> 20:43:26,191 And now if I go back to\nmy browser and reload 21702 20:43:26,191 --> 20:43:30,121 no matter what's in the URL,\nDavid or Carter, it's always-- 21703 20:43:34,142 --> 20:43:36,602 Oh, I didn't change my template back. 21704 20:43:37,142 --> 20:43:40,412 Let me change that back to be\nname, so that it's name there 21705 20:43:41,342 --> 20:43:43,892 But I've hardcoded\nEmma's name, so now we're 21706 20:43:43,892 --> 20:43:47,492 only ever going to see Emma no\nmatter whose name is in the URL. 21707 20:43:48,521 --> 20:43:51,781 All right, so this is\na bad user interface 21708 20:43:51,782 --> 20:43:54,692 if, in order to get a greeting\nfor the day, you, the user 21709 20:43:54,691 --> 20:43:57,182 have to manually change the\nURL, which none of us ever do. 21710 20:43:57,182 --> 20:43:58,982 This is not how web pages work. 21711 20:43:58,982 --> 20:44:02,972 What is the more normal mechanism\n 21712 20:44:02,971 --> 20:44:06,661 and putting it in that\nURL automatically? 21713 20:44:06,661 --> 20:44:09,542 How did we do that last week? 21714 20:44:11,262 --> 20:44:15,952 AUDIENCE: We have the search\nbar and we [INAUDIBLE] you have 21715 20:44:15,952 --> 20:44:22,101 to make something in there [INAUDIBLE]. 21716 20:44:22,101 --> 20:44:25,732 DAVID: OK, so we did make something in\n 21717 20:44:25,732 --> 20:44:29,601 And specifically, what was the tag\n 21718 20:44:30,831 --> 20:44:32,761 DAVID: Sorry, a little louder? 21719 20:44:36,441 --> 20:44:38,792 DAVID: So the input tag,\ninside of the form tag. 21720 20:44:38,792 --> 20:44:41,536 So in short, forms are, of\ncourse, how the web works 21721 20:44:41,536 --> 20:44:43,411 and how we typically\nget input from the user 21722 20:44:43,411 --> 20:44:46,739 whether it's a button or a text box\n 21723 20:44:46,740 --> 20:44:48,782 So let's go ahead and add\nthat into the mix here.
21724 20:44:48,782 --> 20:44:52,702 So let's enhance this hello app\n 21725 20:44:54,001 --> 20:44:58,101 Let me get rid of this\nname stuff and let me just 21726 20:44:58,101 --> 20:45:03,891 have a very simple index.html file\n 21727 20:45:03,892 --> 20:45:06,242 the user for some input as follows. 21728 20:45:06,241 --> 20:45:10,851 I'm going to go back into my\n 21729 20:45:10,851 --> 20:45:13,851 the user's name, this is the page I'm\n 21730 20:45:14,642 --> 20:45:16,972 So I'm going to create a form tag. 21731 20:45:16,971 --> 20:45:20,631 The method I'm going to use for now\n 21732 20:45:20,631 --> 20:45:23,092 Then, inside of that form, I'm\ngoing to have an input tag. 21733 20:45:23,092 --> 20:45:25,634 And I'm going to turn off\nautocomplete like we did last week. 21734 20:45:25,634 --> 20:45:29,542 I'm going to turn on auto focus, so it\n 21735 20:45:29,542 --> 20:45:33,263 I'm going to give the name\nof this input the name, name. 21736 20:45:33,263 --> 20:45:35,971 Not to be too confusing, but I'm\n 21737 20:45:35,971 --> 20:45:39,531 So it makes sense that the name of the\n 21738 20:45:39,532 --> 20:45:42,232 The placeholder I want the\nhuman to see in light gray text 21739 20:45:42,232 --> 20:45:45,262 will be Name with a capital N,\n 21740 20:45:45,262 --> 20:45:47,512 And then type of this text fiel-- 21741 20:45:47,512 --> 20:45:49,372 type of this input is going to be text. 21742 20:45:49,372 --> 20:45:52,312 Then I'm just going to give myself,\n 21743 20:45:52,312 --> 20:45:54,229 And I don't care what\nit says, it's just going 21744 20:45:54,229 --> 20:45:56,582 to say the default submit terminology. 21745 20:45:56,581 --> 20:46:01,072 Let me go ahead, now, and open\nup my terminal window again. 21746 20:46:01,072 --> 20:46:06,232 Let me go to that same URL\nso that I can see-- whoops. 21747 20:46:11,751 --> 20:46:13,301 So that was just cached from earlier. 
21748 20:46:13,301 --> 20:46:16,601 Let me go back to that same\nURL, my GitHub preview.dev URL 21749 20:46:18,021 --> 20:46:19,932 And now, I can type in anything I want. 21750 20:46:19,932 --> 20:46:23,601 The catch, though, is when I click\n 21751 20:46:24,642 --> 20:46:28,002 It does have a default value,\nbut let me go into my index.html 21752 20:46:28,001 --> 20:46:30,761 and let me add, just like we\ndid last week for it, Google. 21753 20:46:30,762 --> 20:46:36,188 Whereas previously, I said something\n 21754 20:46:36,188 --> 20:46:38,021 we're not going to rely\non some third party. 21755 20:46:38,021 --> 20:46:40,482 I'm going to implement\nthe so-called backend 21756 20:46:40,482 --> 20:46:44,752 and I'm going to have the user\n 21757 20:46:44,751 --> 20:46:47,381 not just slash, how about /greet. 21758 20:46:47,381 --> 20:46:48,941 I can make it up, whatever I want. 21759 20:46:48,941 --> 20:46:53,411 Greet feels like a nice operative word,\n 21760 20:46:53,411 --> 20:46:56,411 sent when they click\nSubmit on this form. 21761 20:46:56,411 --> 20:46:59,891 All right, so let's go ahead now\nand go back to my browser tab. 21762 20:46:59,892 --> 20:47:02,592 Let me go ahead, actually,\nand let me reload Flask 21763 20:47:02,592 --> 20:47:05,082 here so that it reloads\nall of my changes. 21764 20:47:05,081 --> 20:47:09,161 Let me reload this tab so that I get\n 21765 20:47:10,271 --> 20:47:13,391 If I view page source, we\nindeed see that my browser 21766 20:47:13,392 --> 20:47:15,252 has downloaded the latest HTML. 21767 20:47:15,251 --> 20:47:16,916 So it definitely has changed. 21768 20:47:16,917 --> 20:47:18,292 Let's go ahead and type in David. 21769 20:47:18,292 --> 20:47:22,062 And when I click Submit\nhere, what's going to happen? 21770 20:47:25,262 --> 20:47:28,381 What's going to happen visually,\nfunctionally, however you 21771 20:47:28,381 --> 20:47:32,301 want to interpret when I click Submit. 
21772 20:47:32,801 --> 20:47:34,346 AUDIENCE: [INAUDIBLE] an empty page. 21773 20:47:34,346 --> 20:47:36,471 DAVID: OK, the user's going\nto go to an empty page. 21774 20:47:36,471 --> 20:47:37,763 Pretty good instinct, because-- 21775 20:47:37,763 --> 20:47:40,641 nowhere else have I mentioned\n/greet; it doesn't seem to exist. 21776 20:47:40,642 --> 20:47:44,872 How's the URL going to\nchange, just to be clear? 21777 20:47:44,872 --> 20:47:47,122 What's going to appear,\nsuddenly, in the URL? 21778 20:47:53,563 --> 20:47:56,438 Specifically in the URL, something's\n 21779 20:47:56,994 --> 20:47:58,202 AUDIENCE: The key value pair? 21780 20:47:58,202 --> 20:47:59,577 DAVID: The key value pair, right. 21781 20:48:00,661 --> 20:48:02,941 That's why our Google\ntrick last week worked. 21782 20:48:02,941 --> 20:48:05,604 I sort of recreated a\nform on my own website. 21783 20:48:05,604 --> 20:48:08,521 And even though I didn't get around\n 21784 20:48:08,521 --> 20:48:12,182 I can still send the information\n 21785 20:48:13,351 --> 20:48:16,232 to your question earlier, that\nwhenever you submit a form 21786 20:48:16,232 --> 20:48:19,502 it automatically ends up after\na question mark in the URL 21787 20:48:20,432 --> 20:48:23,221 So both of you are\nright, this is going to break. 21788 20:48:23,221 --> 20:48:26,761 And all three of you are right,\nin effect, 404 not found. 21789 20:48:26,762 --> 20:48:28,211 You can see it in the tab here. 21790 20:48:28,211 --> 20:48:29,711 That's the error that has come back. 21791 20:48:29,711 --> 20:48:33,482 But what's interesting, and most\nimportant, the URL did change. 21792 20:48:33,482 --> 20:48:37,652 And it went to /greet?name=david. 21793 20:48:37,652 --> 20:48:40,021 So I just, now, need to add\nsome logic that actually 21794 20:48:40,021 --> 20:48:41,762 looks for that so-called route. 21795 20:48:41,762 --> 20:48:44,282 So let me go back to my app.py.
21796 20:48:44,282 --> 20:48:49,442 Let me define another route for,\nquote unquote, "slash greet." 21797 20:48:49,441 --> 20:48:52,961 And then, inside of-- under this,\n 21798 20:48:52,961 --> 20:48:56,042 I'll call it greet, but I\ncould call it anything I want. 21799 20:48:56,042 --> 20:48:58,711 No arguments, for now,\nfor this, and then 21800 20:48:58,711 --> 20:49:02,191 let me go ahead and\ndo this in my app.py. 21801 20:49:02,191 --> 20:49:05,051 This time around, I do want\nto get the human's name. 21802 20:49:05,051 --> 20:49:09,061 So let me say request.args\nget quote unquote "name" 21803 20:49:09,062 --> 20:49:11,101 and let me store that in\na variable called name. 21804 20:49:11,101 --> 20:49:14,792 Then let me return a\ntemplate, and you know 21805 20:49:14,792 --> 20:49:17,372 what, I'm going to give myself\na new template, greet.html. 21806 20:49:17,372 --> 20:49:19,622 Because this has a different\npurpose, it's not a form. 21807 20:49:19,622 --> 20:49:22,562 I want to say hello to the\nuser in this HTML file 21808 20:49:22,562 --> 20:49:27,722 and I want to pass, into it, the\n 21809 20:49:27,721 --> 20:49:34,081 All right, so now if I go up and\n 21810 20:49:36,241 --> 20:49:39,691 If I go ahead and hit reload or resubmit\n 21811 20:49:46,241 --> 20:49:48,652 Let me try, so let's try this. 21812 20:49:48,652 --> 20:49:50,232 Let's go ahead and reload the page. 21813 20:49:50,232 --> 20:49:51,741 Previously, it was not found. 21814 20:49:51,741 --> 20:49:55,661 Now it's worse, and this is\nthe 500 error, internal server 21815 20:49:55,661 --> 20:49:59,891 error that I promised next week we will\n 21816 20:49:59,892 --> 20:50:01,782 But here we have an\ninternal server error.
21817 20:50:01,782 --> 20:50:05,512 Because it's an internal error, this\n 21818 20:50:05,512 --> 20:50:09,042 So the route was actually found\n 21819 20:50:09,042 --> 20:50:14,502 But if we go into VS Code here and\n 21820 20:50:17,622 --> 20:50:21,042 this is actually a bit misleading. 21821 20:50:32,202 --> 20:50:34,872 OK, here we have this\nerror here, and this 21822 20:50:34,872 --> 20:50:37,241 is where your terminal window\nis going to be helpful. 21823 20:50:37,241 --> 20:50:40,152 Your terminal window,\nby default, is typically 21824 20:50:40,152 --> 20:50:43,271 going to show helpful\nstuff like a log, L-O-G 21825 20:50:43,271 --> 20:50:46,608 of what it is the server\nis seeing from the browser. 21826 20:50:46,608 --> 20:50:48,941 For instance, here's what the\nserver just saw in purple. 21827 20:50:48,941 --> 20:50:54,072 GET /greet?name=david\nusing HTTP version 1.0. 21828 20:50:54,072 --> 20:50:57,551 Here, though, is the status code\nthat the server returned, 500. 21829 20:50:58,601 --> 20:51:02,021 Well, here's where we get these\n 21830 20:51:02,021 --> 20:51:03,851 that help50 might\nultimately help you with 21831 20:51:03,851 --> 20:51:07,164 or here, we might just\nhave a clue at the bottom. 21832 20:51:07,164 --> 20:51:09,581 And this is actually pretty\nclear, even though we've never 21833 20:51:11,892 --> 20:51:14,472 I just didn't create greet.html, right? 21834 20:51:15,498 --> 20:51:17,831 All right, so that must be\nthe last piece of the puzzle. 21835 20:51:17,831 --> 20:51:21,221 And again, representative of how you\n 21836 20:51:21,221 --> 20:51:24,311 let me go into my terminal window. 21837 20:51:24,312 --> 20:51:28,812 After hitting Control C, which\ncancels or interrupts a process 21838 20:51:28,812 --> 20:51:30,792 let me go into my templates directory. 21839 20:51:30,792 --> 20:51:33,672 If I type ls, I only have index.html. 21840 20:51:33,672 --> 20:51:36,222 So let's code up greet.html.
21841 20:51:36,221 --> 20:51:40,121 And in this file let's\nquickly do doc type. 21842 20:51:40,122 --> 20:51:44,922 Doc type HTML, open bracket\nHTML, language equals English. 21843 20:51:44,922 --> 20:51:48,851 Inside of this, I'll have the head tag,\n 21844 20:51:48,851 --> 20:51:53,501 The name is viewport,\nthe content of which is-- 21845 20:51:55,782 --> 20:52:02,982 The content of which is initial scale\n 21846 20:52:02,982 --> 20:52:05,892 Quote unquote, title\nis still going to be 21847 20:52:05,892 --> 20:52:08,292 I'll call this greet\nbecause this is my template. 21848 20:52:08,292 --> 20:52:13,642 And then here, in the body, I'm\ngoing to have hello comma name. 21849 20:52:13,642 --> 20:52:17,862 So I could have kept around the old\n 21850 20:52:17,861 --> 20:52:19,331 essentially, my second template. 21851 20:52:19,331 --> 20:52:22,961 So index.html now is almost the\nsame, but the title is different 21852 20:52:24,342 --> 20:52:27,552 greet.html is almost the same,\nbut it does not have a form. 21853 20:52:27,551 --> 20:52:29,781 It just has the hello comma name. 21854 20:52:29,782 --> 20:52:34,272 So let me now go ahead and\nrerun in the correct directory. 21855 20:52:34,271 --> 20:52:38,842 You have to run Flask wherever app.py\n 21856 20:52:38,842 --> 20:52:41,802 So let me do Flask run to\nget back to where I was. 21857 20:52:41,801 --> 20:52:43,661 Let me go into my other tab. 21858 20:52:43,661 --> 20:52:46,841 Cross my fingers this time\nthat, when I go back to slash 21859 20:52:46,842 --> 20:52:51,792 and I get index.html's form, now\n 21860 20:52:54,491 --> 20:52:58,001 And now we have a full-fledged web\n 21861 20:52:58,001 --> 20:53:03,191 slash and /greet, the latter of\n 21862 20:53:03,191 --> 20:53:05,231 using a template, spits it out. 21863 20:53:05,232 --> 20:53:08,752 But something could go wrong,\nand let's see what happens here. 21864 20:53:08,751 --> 20:53:11,381 Suppose I don't type anything in. 
21865 20:53:11,381 --> 20:53:13,631 Let me go here and just click Submit. 21866 20:53:13,631 --> 20:53:16,423 Now, I mean, it looks stupid. 21867 20:53:16,423 --> 20:53:18,381 So there are bunches of\nways we could solve this. 21868 20:53:18,381 --> 20:53:21,822 I could require that the user\nhave input on the previous page 21869 20:53:21,822 --> 20:53:23,955 I could have some kind\nof error check for this. 21870 20:53:23,955 --> 20:53:26,622 But there's another mechanism I\ncan use that I'll just show you. 21871 20:53:26,622 --> 20:53:30,911 It turns out that with this get function,\nin the context of HTTP 21872 20:53:30,911 --> 20:53:33,161 and also in general with\nPython dictionaries 21873 20:53:33,161 --> 20:53:35,572 you can actually supply a default value. 21874 20:53:35,572 --> 20:53:39,941 So if there is no name parameter\n 21875 20:53:39,941 --> 20:53:42,591 you can actually give it\na default value like this. 21876 20:53:42,592 --> 20:53:44,771 So I'll say world, for instance. 21877 20:53:46,211 --> 20:53:48,342 Let me type in nothing\nagain and click Submit. 21878 20:53:48,342 --> 20:53:52,032 And hopefully this time,\nI'll do-- oops, sorry. 21879 20:53:52,032 --> 20:53:54,792 Let me restart Flask\nto reload the template. 21880 20:53:54,792 --> 20:53:57,851 Let me go ahead and type nothing\nthis time, clicking Submit. 21881 20:54:05,622 --> 20:54:09,461 Suppose that the reason this-- 21882 20:54:11,021 --> 20:54:15,191 Suppose I just get rid of name\n 21883 20:54:15,191 --> 20:54:17,831 Now I see hello, world,\nand this is a subtlety 21884 20:54:17,831 --> 20:54:19,961 that I didn't intend to get into here. 21885 20:54:19,961 --> 20:54:23,381 When you have question\nmark name equals nothing 21886 20:54:23,381 --> 20:54:25,421 you're passing in\nwhat's called-- whoops. 21887 20:54:25,422 --> 20:54:29,112 When you have greet question\nmark name equals something 21888 20:54:29,111 --> 20:54:31,841 you actually are giving a value to name.
21889 20:54:31,842 --> 20:54:34,482 It is quote unquote\nwith nothing in between. 21890 20:54:34,482 --> 20:54:36,952 That is different from\nhaving no value at all. 21891 20:54:36,952 --> 20:54:40,965 So allow me to just propose\nthat, for the error here, we 21892 20:54:40,964 --> 20:54:42,881 would want to require\nthis in a different way. 21893 20:54:42,881 --> 20:54:45,372 And probably the most\nrobust way to do this 21894 20:54:45,372 --> 20:54:51,021 would be to go in here, in my HTML, and\n 21895 20:54:51,021 --> 20:54:55,751 Now, if I go back to my form\nafter restarting Flask here 21896 20:54:55,751 --> 20:54:59,711 and I go ahead and click reload\non my form and type in nothing 21897 20:54:59,711 --> 20:55:03,342 and click Submit, now the\nbrowser is going to yell at me. 21898 20:55:03,342 --> 20:55:05,502 But just as a teaser\nfor something we'll be 21899 20:55:05,501 --> 20:55:08,231 doing in the next problem set\nin terms of error checking 21900 20:55:08,232 --> 20:55:14,771 you should never, ever, ever rely on\n 21901 20:55:14,771 --> 20:55:19,301 Because we know, from last week, that\n 21902 20:55:19,301 --> 20:55:21,682 and let me poke around the HTML here. 21903 20:55:21,682 --> 20:55:24,042 Let me go into the body, the form. 21904 20:55:24,042 --> 20:55:26,661 OK, you say required,\nI say not required. 21905 20:55:26,661 --> 20:55:29,721 You can just delete what's\nin the DOM, in the browser 21906 20:55:29,721 --> 20:55:32,682 and now I can go ahead\nand submit this form. 21907 20:55:32,682 --> 20:55:34,271 And it appears to be broken.
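The difference between a blank value and no value at all can be seen with nothing but the Python standard library. This is a sketch under one assumption: the helper name_from is hypothetical and only imitates, with a plain dictionary, how a default passed to get behaves; it is not Flask's own code.

```python
from urllib.parse import parse_qs, urlsplit


def name_from(url):
    # Parse the query string, keeping parameters whose value is blank.
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    # Flatten {"name": [""]} to {"name": ""}, roughly like request.args.
    args = {key: values[0] for key, values in query.items()}
    # The default is used only when the key is absent altogether.
    return args.get("name", "world")


blank = name_from("/greet?name=")  # key present, value is the empty string
absent = name_from("/greet")       # key absent, so the default applies
print(repr(blank), repr(absent))
```

Submitting the form with an empty text box corresponds to the first case, which is why the default of world did not appear until name was removed from the URL entirely.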
21908 20:55:34,271 --> 20:55:37,461 Not a big deal with a silly little\n 21909 20:55:37,461 --> 20:55:40,512 But if you're trying to\nrequire that humans actually 21910 20:55:40,512 --> 20:55:43,902 provide input that is necessary for\n 21911 20:55:43,902 --> 20:55:49,452 you don't want to trust that the HTML\n 21912 20:55:49,452 --> 20:55:52,332 All right, any questions,\nthen, on this particular app 21913 20:55:52,331 --> 20:55:56,211 before we add another feature here? 21914 20:55:59,947 --> 20:56:01,911 AUDIENCE: Do you guys [INAUDIBLE]. 21915 20:56:04,255 --> 20:56:05,422 DAVID: Sorry, little louder. 21916 20:56:14,270 --> 20:56:15,812 DAVID: Would it be a problem if what? 21917 20:56:15,812 --> 20:56:17,285 AUDIENCE: You have to [INAUDIBLE]. 21918 20:56:22,122 --> 20:56:24,521 What you should really do is something\n 21919 20:56:24,521 --> 20:56:26,092 where I'm going to start\nerror checking things. 21920 20:56:26,092 --> 20:56:27,792 So let me wave my hands at\nthat and propose that we'll 21921 20:56:27,792 --> 20:56:29,259 solve this better in just a bit. 21922 20:56:29,259 --> 20:56:31,301 But it's not bad to do\nwhat I just did here, it's 21923 20:56:31,301 --> 20:56:34,241 only going to handle one of the\n 21924 20:56:35,751 --> 20:56:38,051 All right, so even though\nthis is new to most of us 21925 20:56:38,051 --> 20:56:43,391 here, consider index.html, my first\n 21926 20:56:45,551 --> 20:56:48,783 What might be arguably badly designed? 21927 20:56:48,783 --> 20:56:50,741 Even though this might\nbe the first time you've 21928 20:56:50,741 --> 20:56:54,461 ever touched web programming like this. 21929 20:56:54,461 --> 20:57:01,902 What's bad or dumb about this\n 21930 20:57:01,902 --> 20:57:05,831 And there's a reason, too, that I bored\n 21931 20:57:06,642 --> 20:57:11,415 AUDIENCE: [INAUDIBLE] you said,\n 21932 20:57:11,414 --> 20:57:13,081 DAVID: Yeah, there's so much repetition. 
21933 20:57:13,081 --> 20:57:16,322 I mean, it was deliberately tedious\n 21934 20:57:16,322 --> 20:57:19,351 The doc type, the HTML tag,\nthe head tag, the title tag. 21935 20:57:19,351 --> 20:57:21,991 And little things did change\nalong the way, like the title 21936 20:57:21,991 --> 20:57:24,031 and certainly, the content of the body. 21937 20:57:24,032 --> 20:57:27,302 But so much of this, I\nmean, almost all of the page 21938 20:57:27,301 --> 20:57:30,599 is a copy of itself in multiple files. 21939 20:57:30,599 --> 20:57:33,932 And God forbid we have a third template,\n 21940 20:57:35,072 --> 20:57:37,441 This is going to get very\ntedious very quickly. 21941 20:57:37,441 --> 20:57:39,970 And suppose you want to\nchange something in one place 21942 20:57:39,970 --> 20:57:43,262 you're going to have to change it now in\n 21943 20:57:43,961 --> 20:57:46,741 So just like in\nprogramming more generally 21944 20:57:46,741 --> 20:57:49,230 we have this ability to\nfactor out commonalities. 21945 20:57:49,230 --> 20:57:51,480 So do you in the context\nof web programming 21946 20:57:51,480 --> 20:57:54,271 and specifically\ntemplating, have the ability 21947 20:57:54,271 --> 20:57:56,379 to factor out all of\nthose commonalities. 21948 20:57:56,379 --> 20:57:58,171 The syntax is going to\nbe a little curious 21949 20:57:58,171 --> 20:58:01,092 but it functionally is\npretty straightforward. 21950 20:58:02,292 --> 20:58:06,301 Let me go ahead and copy\nthe contents of index.html. 21951 20:58:06,301 --> 20:58:09,182 Let me go into my templates\ndirectory and code a file that 21952 20:58:09,182 --> 20:58:12,131 by default, is called layout.html. 21953 20:58:12,131 --> 20:58:15,751 And let me go ahead, and per\nyour answer, copy all of those 21954 20:58:15,751 --> 20:58:18,282 commonalities into\nthis file now instead. 21955 20:58:18,282 --> 20:58:21,151 So here I have a file\ncalled layout.html. 
21956 20:58:21,150 --> 20:58:26,311 I don't want to give every page the same\n 21957 20:58:26,312 --> 20:58:27,782 I'm going to call everything hello. 21958 20:58:27,782 --> 20:58:30,662 But in the body of the page,\nwhat I'm going to do here is just 21959 20:58:30,661 --> 20:58:35,262 have a placeholder for actual\ncontents that do change. 21960 20:58:35,262 --> 20:58:38,131 So in this layout, I'm\ngoing to go ahead in here 21961 20:58:38,131 --> 20:58:42,822 and just put in the body of my\npage, how about this syntax? 21962 20:58:44,372 --> 20:58:48,782 Block body, and then percent\nsign close curly brace. 21963 20:58:48,782 --> 20:58:50,972 And then I'm going to do end block. 21964 20:58:50,971 --> 20:58:55,891 So a curious syntax here, but\nthis is more template syntax. 21965 20:58:55,892 --> 20:58:59,222 The other template syntax we saw\n 21966 20:58:59,221 --> 20:59:01,051 That's for just plugging in values. 21967 20:59:01,051 --> 20:59:05,311 There's this other syntax with Flask\n 21968 20:59:05,312 --> 20:59:10,112 brace, a percent sign, and then some\n 21969 20:59:10,892 --> 20:59:13,382 And this one's a little weird\nbecause there's literally 21970 20:59:13,381 --> 20:59:16,721 nothing between the close curly\nand the open curly brace here. 21971 20:59:16,721 --> 20:59:19,201 But let's see what this can do for us. 21972 20:59:19,202 --> 20:59:25,682 Let me now go into my index.html, which\n 21973 20:59:25,682 --> 20:59:28,952 from, and let me focus on\nwhat is minimally different. 21974 20:59:28,952 --> 20:59:33,881 The only thing that's really different\n 21975 20:59:33,881 --> 20:59:37,682 So let me go ahead and just cut\nthat form out to my clipboard. 21976 20:59:37,682 --> 20:59:40,441 Let me change the first\nline of index.html 21977 20:59:40,441 --> 20:59:46,741 to say this file is going\nto extend layout.html 21978 20:59:46,741 --> 20:59:48,932 and notice I'm using\nthe curly braces again. 
21979 20:59:48,932 --> 20:59:52,381 And this file is going to\nhave its own body block 21980 20:59:52,381 --> 20:59:57,301 inside of which is just\nthe HTML that I actually 21981 20:59:57,301 --> 20:59:59,792 want to make specific to this page. 21982 20:59:59,792 --> 21:00:02,051 And I'll keep my indentation\nnice and neat here. 21983 21:00:02,051 --> 21:00:03,542 And let's consider what I've done. 21984 21:00:03,542 --> 21:00:05,971 This is starting to look\nweird fast, and this is now 21985 21:00:05,971 --> 21:00:09,871 a mix of HTML with templating code. 21986 21:00:09,872 --> 21:00:15,961 Index.html, first line now says, hey,\n 21987 21:00:17,072 --> 21:00:20,402 These next lines, three through\n10, say, hey, Flask, here 21988 21:00:20,402 --> 21:00:23,732 is what I consider my body block to be. 21989 21:00:23,732 --> 21:00:27,362 Plug this into the layout placeholder. 21990 21:00:27,361 --> 21:00:33,611 Therefore, so if I now go back\nto layout.html, and layout.html 21991 21:00:33,611 --> 21:00:35,671 it's almost all HTML by contrast. 21992 21:00:35,672 --> 21:00:38,757 But there is this placeholder, and\n 21993 21:00:39,631 --> 21:00:41,402 If I want to put a\ndefault value, I could 21994 21:00:41,402 --> 21:00:45,002 put a default value there just in case\n 21995 21:00:45,001 --> 21:00:47,171 But in general, that's\nnot going to be relevant. 21996 21:00:47,172 --> 21:00:50,882 So this is just a placeholder,\n 21997 21:00:50,881 --> 21:00:54,491 plug in the page-specific\ncontent right here. 21998 21:00:54,491 --> 21:00:58,262 So if I go now into greet.html,\nthis one's even easier. 21999 21:00:58,262 --> 21:01:01,262 I'm going to cut this content\nand get rid of everything else. 22000 21:01:01,262 --> 21:01:06,842 Greet.html, too, is going to extend\n 22001 21:01:06,842 --> 21:01:13,512 and then I'm going to have my body block\n 22002 21:01:13,512 --> 21:01:15,961 And then I'm going to go\nahead and end that block here.
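The layout-and-extends arrangement can be tried outside of Flask with the jinja2 package alone. This is a minimal sketch, assuming jinja2 is installed; the template text below is a simplified guess at the shape of the lecture's layout.html and greet.html (the head is abbreviated), held in memory rather than in a templates directory.

```python
from jinja2 import DictLoader, Environment

templates = {
    # Shared skeleton with a placeholder block called body.
    "layout.html": (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head><title>hello</title></head>\n"
        "<body>{% block body %}{% endblock %}</body>\n"
        "</html>"
    ),
    # Each page supplies only what differs: the contents of the body block.
    "greet.html": (
        '{% extends "layout.html" %}\n'
        "{% block body %}hello, {{ name }}{% endblock %}"
    ),
}

env = Environment(loader=DictLoader(templates))
page = env.get_template("greet.html").render(name="David")
print(page)
```

The rendered page contains only HTML: the block and extends tags are consumed by the template engine and never reach the browser.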
22003 21:01:15,961 --> 21:01:18,812 These are not HTML tags,\nthis is not HTML syntax. 22004 21:01:18,812 --> 21:01:22,741 Technically, the syntax we keep\nseeing with the curly braces 22005 21:01:22,741 --> 21:01:28,982 and these now curly braces with percent\n 22006 21:01:28,982 --> 21:01:33,812 J-I-N-J-A, which is a language,\nthat some humans invented 22007 21:01:33,812 --> 21:01:35,732 for this purpose of templating. 22008 21:01:35,732 --> 21:01:37,952 And the people who\ninvented Flask decided 22009 21:01:37,952 --> 21:01:40,052 we're not going to come\nup with our own syntax 22010 21:01:40,051 --> 21:01:44,066 we're going to use these other\n 22011 21:01:44,066 --> 21:01:46,441 So again, there starts to be\nat this point in the course 22012 21:01:46,441 --> 21:01:50,371 and really in computing, a lot of\n 22013 21:01:50,952 --> 21:01:54,842 So Flask is using this syntax, but\n 22014 21:01:57,032 --> 21:02:01,832 All right, so now\nindex.html is half HTML 22015 21:02:01,831 --> 21:02:04,261 half templating code, Jinja syntax. 22016 21:02:04,262 --> 21:02:08,012 Greet.html is almost all\nJinja syntax, no tags even 22017 21:02:08,012 --> 21:02:11,582 but because they both\nextend layout.html 22018 21:02:11,581 --> 21:02:14,911 now I think I've improved\nthe design of this thing. 22019 21:02:14,911 --> 21:02:18,941 If I go back to app.py, none\nof this really needs to change. 22020 21:02:18,941 --> 21:02:22,111 I don't change my templates\nto mention layout.html 22021 21:02:22,111 --> 21:02:25,621 that's already implicit in the fact\n 22022 21:02:25,622 --> 21:02:28,741 So now if I go ahead and\nopen my terminal window 22023 21:02:28,741 --> 21:02:32,822 go back to the same folder\nas app.py and do Flask run 22024 21:02:32,822 --> 21:02:35,732 all right, my application\nis running on port 5000. 
22025 21:02:35,732 --> 21:02:39,392 Let me now go back to the slash route\nin my browser and hit Enter 22026 21:02:40,682 --> 21:02:44,252 And just as a little check, let\nme view the source of the page 22027 21:02:45,932 --> 21:02:48,361 And there's all of the code. 22028 21:02:48,361 --> 21:02:51,421 No mention of Jinja, no curly\nbraces, no percent signs. 22029 21:02:52,096 --> 21:02:54,721 It's not quite pretty printed in\nthe same way, but that's fine. 22030 21:02:54,721 --> 21:02:57,263 Because now, we're starting to\ndynamically generate websites. 22031 21:02:57,263 --> 21:03:00,512 And by that, I mean this isn't\n 22032 21:03:01,111 --> 21:03:03,871 If it's indented in the\nsource code version 22033 21:03:03,872 --> 21:03:05,911 doesn't matter what the\nbrowser really sees. 22034 21:03:05,911 --> 21:03:08,402 Let me now go ahead and type\nin my name, click Submit. 22035 21:03:08,402 --> 21:03:10,051 I should see, yep, hello, David. 22036 21:03:10,051 --> 21:03:12,152 Let me go ahead and view\nthe source of this page. 22037 21:03:12,152 --> 21:03:16,211 And we'll see almost the same\n 22038 21:03:16,211 --> 21:03:19,741 So this is, now, web programming\nin the literal sense. 22039 21:03:19,741 --> 21:03:23,042 I did not hardcode a page that says\n 22040 21:03:24,032 --> 21:03:27,752 I hardcoded a page that has a\ntemplate with a placeholder 22041 21:03:27,751 --> 21:03:31,471 and now I'm using actual\nlogic, some code in app.py 22042 21:03:31,471 --> 21:03:37,751 to actually tell the server\nwhat to send to the browser. 22043 21:03:37,751 --> 21:03:42,901 All right, any questions,\nthen, on where we're at here? 22044 21:03:42,902 --> 21:03:45,271 This is now a web application. 22045 21:03:45,271 --> 21:03:49,051 Simple though it is, it's\nno longer just a web site. 22046 21:03:49,922 --> 21:03:54,521 AUDIENCE: Is what we did just better\n 22047 21:03:54,521 --> 21:03:57,232 DAVID: Is it better for\ndesign or for memory?
22048 21:03:57,732 --> 21:03:59,922 It's definitely better\nfor design because, truly 22049 21:03:59,922 --> 21:04:02,112 if we had a third page,\nfourth page, I would really 22050 21:04:02,111 --> 21:04:03,911 start just resorting to copy paste. 22051 21:04:03,911 --> 21:04:07,031 And as you saw with home page,\noften, in the head of your page 22052 21:04:07,032 --> 21:04:10,632 you might want to include some CSS\n 22053 21:04:10,631 --> 21:04:13,122 You might want to have\nother information up there. 22054 21:04:13,122 --> 21:04:16,754 If you had to upgrade the version of\n 22055 21:04:16,754 --> 21:04:18,461 so you want to change\none of those lines 22056 21:04:18,461 --> 21:04:21,851 you would literally have to go into\n 22057 21:04:26,501 --> 21:04:30,731 Theoretically, the server, because\n 22058 21:04:30,732 --> 21:04:33,394 it can theoretically do some\noptimizations underneath the hood. 22059 21:04:33,394 --> 21:04:36,101 Flask is probably doing that, but\n 22060 21:04:36,101 --> 21:04:38,021 We're using it in\ndevelopment mode, which 22061 21:04:38,021 --> 21:04:41,592 means it's typically\nreloading things each time. 22062 21:04:41,592 --> 21:04:44,802 Other questions on this application? 22063 21:04:48,842 --> 21:04:54,461 All right, so let me ask a question,\n 22064 21:04:54,461 --> 21:04:56,641 What about the implications for privacy? 22065 21:04:56,642 --> 21:05:02,192 Why is this maybe not the best design\n 22066 21:05:05,066 --> 21:05:06,491 AUDIENCE: For some reason,\nyou wanted your name. 22067 21:05:06,491 --> 21:05:08,682 So these private people\ncould just look at the URL. 
22068 21:05:09,182 --> 21:05:11,042 I mean, if you have a\nnosy sibling or roommate 22069 21:05:11,042 --> 21:05:13,051 and they have access to\nyour laptop and they just 22070 21:05:13,051 --> 21:05:15,301 go trolling through your\nautocomplete or your history 22071 21:05:15,301 --> 21:05:18,331 like, literally what you typed into\n 22072 21:05:18,331 --> 21:05:21,331 Not a big deal if it's your name, but\n 22073 21:05:21,331 --> 21:05:23,551 card or anything else\nthat's mildly sensitive 22074 21:05:23,551 --> 21:05:26,941 you probably don't want it\nending up in the URL at all 22075 21:05:26,941 --> 21:05:29,402 even if you're in\nincognito mode or whatnot. 22076 21:05:29,402 --> 21:05:34,152 You just don't want to expose yourself\n 22077 21:05:34,152 --> 21:05:35,941 So perhaps, we can do better than that. 22078 21:05:35,941 --> 21:05:38,311 And fortunately, this one\nis actually an easy change. 22079 21:05:38,312 --> 21:05:43,502 Let me go into my\nindex.html where my form is. 22080 21:05:43,501 --> 21:05:48,271 And in my form, I can just change\nthe method from GET to POST. 22081 21:05:48,271 --> 21:05:50,911 It's still going to send key\nvalue pairs to the server 22082 21:05:50,911 --> 21:05:52,951 but it's not going to\nput them in the URL. 22083 21:05:52,952 --> 21:05:56,256 The upside of which is that we\ncan assuage this privacy concern 22084 21:05:56,256 --> 21:05:58,381 but I'm going to have to\nmake one other change too. 22085 21:05:58,381 --> 21:06:02,851 Because now, if I go ahead and run\n 22086 21:06:02,851 --> 21:06:07,331 and I now reload the form to make\n 22087 21:06:07,331 --> 21:06:10,171 You should be in the habit\nof going to View, Developer 22088 21:06:10,172 --> 21:06:12,783 View Source, or Developer\nTools just to make sure 22089 21:06:12,783 --> 21:06:15,241 that what you're seeing in your\nbrowser is what you intend. 22090 21:06:15,241 --> 21:06:18,062 And yes, I do see what I wanted. 
22091 21:06:19,652 --> 21:06:22,381 Let me go ahead and type\nin David and click Submit. 22092 21:06:22,381 --> 21:06:24,872 Now I get a different error. 22093 21:06:24,872 --> 21:06:28,902 This one is HTTP 405,\nmethod not allowed. 22094 21:06:30,001 --> 21:06:34,081 Well, in my Flask application, I've\n 22095 21:06:34,081 --> 21:06:37,601 One of which is for slash,\nthen that worked fine. 22096 21:06:37,601 --> 21:06:40,751 One of which is for /greet,\nand that used to work fine. 22097 21:06:40,751 --> 21:06:46,301 But apparently, what Flask is doing\n 22098 21:06:46,301 --> 21:06:51,512 So if I want to change this route to\n 22099 21:06:51,512 --> 21:06:56,682 quote unquote "POST" inside\nof this parameter here. 22100 21:06:56,682 --> 21:07:01,592 So that now, I can actually\nsupport POST, not just GET. 22101 21:07:01,592 --> 21:07:08,642 And if I now restart Flask, so Flask\n 22102 21:07:08,642 --> 21:07:11,282 Let me go back one screen\nto the form, reload 22103 21:07:11,282 --> 21:07:13,202 the page just to make\nsure I have the latest 22104 21:07:13,202 --> 21:07:14,785 even though nothing there has changed. 22105 21:07:14,785 --> 21:07:18,392 Type David and click Submit now,\nnow I should see hello, world. 22106 21:07:18,392 --> 21:07:25,142 Notice that I'm at the greet route,\n 22107 21:07:28,021 --> 21:07:30,331 All right, so that's an\ninteresting takeaway. 22108 21:07:30,331 --> 21:07:35,141 It's a simple change, but whereas GET\n 22109 21:07:35,142 --> 21:07:37,382 But it still works so long\nas you tweak the backend 22110 21:07:37,381 --> 21:07:41,971 to look as a POST request, which\n 22111 21:07:41,971 --> 21:07:44,521 It's not going to be as simple\nas looking at the URL itself. 22112 21:07:44,521 --> 21:07:46,322 Why shouldn't we just always use POST? 22113 21:07:49,491 --> 21:07:53,751 Why not use POST everywhere? 
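The 405 response and its fix can be reproduced with Flask's test client. This is a sketch assuming Flask is installed; the route name /get-only is hypothetical and exists only to show the default behavior next to a route that lists POST explicitly.

```python
from flask import Flask

app = Flask(__name__)


@app.route("/get-only")
def get_only():
    # With no methods argument, the route answers GET but not POST.
    return "ok"


@app.route("/greet", methods=["POST"])
def greet():
    # Listing POST explicitly lets a form with method="post" reach this route.
    return "ok"


client = app.test_client()
rejected = client.post("/get-only").status_code
accepted = client.post("/greet").status_code
print(rejected, accepted)
```

The first number is the method-not-allowed status from the lecture; the second shows the same kind of request succeeding once the route declares that it supports POST.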
22114 21:07:55,342 --> 21:07:58,822 Right, because it's obnoxious to\n 22115 21:07:58,822 --> 21:08:02,084 if you're leaving these little\n 22116 21:08:02,084 --> 21:08:04,042 can poke around and see\nwhat you've been doing. 22117 21:08:09,138 --> 21:08:11,523 AUDIENCE: You're supposed\nto duplicate [INAUDIBLE]. 22118 21:08:15,532 --> 21:08:18,792 I mean, if you get rid of GET\n 22119 21:08:18,792 --> 21:08:22,194 your history, your autocomplete,\ngets much less useful. 22120 21:08:22,194 --> 21:08:25,152 Because none of the information is\n 22121 21:08:25,152 --> 21:08:26,569 go through the menu and hit Enter. 22122 21:08:26,569 --> 21:08:28,182 You'd have to re-fill out the form. 22123 21:08:28,182 --> 21:08:30,389 And there's this other\nsymptom that you can see here. 22124 21:08:30,389 --> 21:08:33,014 Let me zoom out and let\nme just reload this page. 22125 21:08:33,014 --> 21:08:34,932 Notice that you'll get\nthis warning, and it'll 22126 21:08:34,932 --> 21:08:39,551 look different in Safari and Firefox\n 22127 21:08:40,342 --> 21:08:43,932 So your browser might remember what\n 22128 21:08:43,932 --> 21:08:45,711 but just while you're on the page. 22129 21:08:45,711 --> 21:08:50,562 And this is in contrast to GET,\nwhere the state, the information 22130 21:08:50,562 --> 21:08:53,957 like key value pairs, is\nembedded in the URL itself. 22131 21:08:53,956 --> 21:08:56,081 And if you looked at an\nemail I sent earlier today 22132 21:08:56,081 --> 21:08:59,277 I deliberately linked\nto 22133 21:08:59,277 --> 21:09:00,881 https://www.google.com/search?q=what+time+is+it. 22134 21:09:06,971 --> 21:09:11,471 This is, by definition, a GET\nrequest when you click on it.
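The way a GET request carries its state in the URL can be sketched with the standard library. The snippet rebuilds the link from the lecture out of a single key-value pair and then recovers the pair from the finished URL; no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Build the link from the lecture: the key is q, the value is the search text.
url = "https://www.google.com/search?" + urlencode({"q": "what time is it"})
print(url)

# Anyone holding the URL can recover the same key-value pairs from it.
recovered = parse_qs(urlsplit(url).query)
print(recovered)
```

That second step is the trade-off in miniature: the same property that makes the link shareable and bookmarkable also leaves the submitted values visible in history and autocomplete.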
22135 21:09:11,471 --> 21:09:14,861 Because it's going to grab the\ninformation, the key value pair 22136 21:09:14,861 --> 21:09:18,171 from the URL, send it to Google\n 22137 21:09:18,172 --> 21:09:20,592 And the reason I sent this\nvia email earlier was I 22138 21:09:20,592 --> 21:09:23,812 wanted people to very quickly be able\n 22139 21:09:23,812 --> 21:09:27,732 And so I can sort of automate the process\n 22140 21:09:27,732 --> 21:09:29,952 but that you induce when\nyou click that link. 22141 21:09:29,952 --> 21:09:35,381 If Google did not support GET, they\n 22142 21:09:35,381 --> 21:09:38,171 is send you all to this\nURL which, unfortunately 22143 21:09:39,762 --> 21:09:42,972 I would have had to add to my\n 22144 21:09:44,812 --> 21:09:46,522 So it's just bad for usability. 22145 21:09:46,521 --> 21:09:49,721 So there, too, we might have design\n 22146 21:09:49,721 --> 21:09:53,261 but also the design when it comes\nto the user experience, or UX 22147 21:09:53,262 --> 21:09:54,941 as a computer scientist would call it. 22148 21:09:54,941 --> 21:09:58,211 Just in terms of what you want\nto optimize for, ultimately. 22149 21:09:58,211 --> 21:10:00,081 So GET and POST both have their roles. 22150 21:10:00,081 --> 21:10:02,581 It depends on what kind of\nfunctionality you want to provide 22151 21:10:02,581 --> 21:10:06,911 and what kind of sensitivity\nthere might be around it. 22152 21:10:06,911 --> 21:10:10,122 All right, any questions, then, on\n 22153 21:10:10,122 --> 21:10:13,722 Super simple, just gets someone's\nname and prints it back out. 22154 21:10:13,721 --> 21:10:16,031 But we now have all\nthe plumbing with which 22155 21:10:16,032 --> 21:10:19,032 to create really almost anything we want. 22156 21:10:21,820 --> 21:10:24,112 All right, let's go ahead\nand take a five-minute break.
22157 21:10:24,111 --> 21:10:27,951 And when we come back, we'll add to\n 22158 21:10:30,202 --> 21:10:32,182 And recall that the last\nthing we just changed 22159 21:10:32,182 --> 21:10:34,941 was the route to use\nPOST instead of GET. 22160 21:10:34,941 --> 21:10:37,851 So gone is my name and\nany value in the URL. 22161 21:10:37,851 --> 21:10:44,031 But there was a subtle bug or change\n 22162 21:10:44,032 --> 21:10:47,062 I did type David into the\nform and I did click Submit 22163 21:10:47,062 --> 21:10:50,452 and yet here it is\nsaying hello comma world. 22164 21:10:50,452 --> 21:10:53,491 So that seems to be\nbroken all of a sudden 22165 21:10:53,491 --> 21:10:56,572 even though we added support for POST. 22166 21:10:56,572 --> 21:10:58,792 But something must be wrong. 22167 21:10:58,792 --> 21:11:01,342 Logically, it must be the case here. 22168 21:11:01,342 --> 21:11:05,461 Intuitively, that if I'm seeing\n 22169 21:11:07,161 --> 21:11:10,101 It must be that it's\nnot seeing a key called 22170 21:11:10,101 --> 21:11:14,482 name in request.args, which is this. 22171 21:11:14,482 --> 21:11:16,851 Gives you access to\neverything after the URL. 22172 21:11:16,851 --> 21:11:19,471 That's because there's this\nother thing we should know about 22173 21:11:19,471 --> 21:11:21,741 which is not just\nrequest.args but request.form. 22174 21:11:21,741 --> 21:11:26,211 These are horribly named, but\nrequest.args is for GET requests 22175 21:11:26,211 --> 21:11:28,804 request.form is for POST requests. 22176 21:11:28,804 --> 21:11:31,012 Otherwise, they're pretty\nmuch functionally the same. 22177 21:11:31,012 --> 21:11:33,592 But the onus is on you,\nthe user or the programmer 22178 21:11:33,592 --> 21:11:35,832 to make sure you're using the right one. 
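The hello, world symptom can be reproduced by reading both dictionaries inside one POST route. This is a sketch assuming Flask is installed; the response format is made up for the demonstration, while request.args, request.form, and the default of world come from the lecture.

```python
from flask import Flask, request

app = Flask(__name__)


@app.route("/greet", methods=["POST"])
def greet():
    # For a POST, the submitted fields live in request.form, not request.args.
    from_args = request.args.get("name", "world")
    from_form = request.form.get("name", "world")
    return f"args: {from_args}, form: {from_form}"


client = app.test_client()
# data= sends the fields in the request body, as an HTML form would.
body = client.post("/greet", data={"name": "David"}).get_data(as_text=True)
print(body)
```

Because the name travels in the request body rather than the query string, request.args falls back to its default while request.form sees what was typed.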
22179 21:11:35,831 --> 21:11:38,671 So I think if we want\nto get rid of the world 22180 21:11:38,672 --> 21:11:41,002 and actually see what\nI, the human, typed in 22181 21:11:41,001 --> 21:11:45,201 I think I can just change\nrequest.args to request.form. 22182 21:11:45,202 --> 21:11:47,782 Still dot get, still\nquote unquote "name" 22183 21:11:47,782 --> 21:11:52,252 and now, if I go ahead and rerun\nFlask in my terminal window 22184 21:11:52,251 --> 21:11:54,776 go back to my browser, go\nback to-- and actually 22185 21:11:54,777 --> 21:11:56,152 I won't even go back to the form. 22186 21:11:56,152 --> 21:11:59,392 I will literally just reload,\nCommand R or Control R 22187 21:11:59,392 --> 21:12:02,572 and what this warning is\nsaying is it's going to submit 22188 21:12:02,572 --> 21:12:05,001 the same information to the website. 22189 21:12:05,001 --> 21:12:08,901 When I click Continue, now I\nshould see hello comma David. 22190 21:12:08,902 --> 21:12:11,182 So again, you, too, are\ngoing to encounter, probably 22191 21:12:11,182 --> 21:12:12,801 all these little subtleties. 22192 21:12:12,801 --> 21:12:15,801 But if you focus on, really, the\nfirst principles of last week 22193 21:12:15,801 --> 21:12:18,682 like what is HTTP, how\ndoes a GET request work 22194 21:12:18,682 --> 21:12:20,786 how does a POST request\nwork now, you should 22195 21:12:20,786 --> 21:12:22,911 have a lot of the mental\nbuilding blocks with which 22196 21:12:22,911 --> 21:12:25,101 to solve problems like these. 22197 21:12:25,101 --> 21:12:28,342 And let me give you one other mental\n 22198 21:12:28,342 --> 21:12:32,662 This framework called Flask is just an\n 22199 21:12:32,661 --> 21:12:36,262 that all implement the same\nparadigm, the same way of thinking 22200 21:12:36,262 --> 21:12:38,482 and the same way of\nprogramming applications. 22201 21:12:38,482 --> 21:12:41,932 And that's known as MVC,\nmodel view controller.
22202 21:12:41,932 --> 21:12:46,432 And here's a very simple diagram\n 22203 21:12:46,432 --> 21:12:48,301 and I have been implementing thus far. 22204 21:12:48,301 --> 21:12:51,292 And actually, this is more than\n 22205 21:12:51,292 --> 21:12:55,123 In app.py is what a programmer\n 22206 21:12:55,123 --> 21:12:57,081 That's the code you're\nwriting, the so-called 22207 21:12:57,081 --> 21:13:01,131 business logic that makes all of the\n 22208 21:13:01,131 --> 21:13:03,471 what values to show, and so forth. 22209 21:13:03,471 --> 21:13:09,861 In layout.html, index.html, greet.html\n 22210 21:13:09,861 --> 21:13:12,651 that is the visualizations\nthat the human actually 22211 21:13:14,271 --> 21:13:18,561 Those things are dumb, they pretty\n 22212 21:13:18,562 --> 21:13:20,991 All of the hard work is done in app.py. 22213 21:13:20,991 --> 21:13:26,001 So controller, AKA app.py, is where\n 22214 21:13:26,001 --> 21:13:31,491 And in your view is where your HTML and\n 22215 21:13:31,491 --> 21:13:35,512 the curly braces, the curly braces\n 22216 21:13:35,512 --> 21:13:39,411 We haven't added an M\nto MVC yet. The model, that's 22217 21:13:39,411 --> 21:13:42,322 going to refer to things\nlike CSV files or databases. 22218 21:13:42,322 --> 21:13:46,116 The model is where you keep\nactual data, typically long term. 22219 21:13:46,116 --> 21:13:47,991 So we'll come back to\nthat, but this picture 22220 21:13:47,991 --> 21:13:52,312 where you have one of these-- each of\n 22221 21:13:52,312 --> 21:13:55,222 another is representative of\nhow a lot of frameworks work. 22222 21:13:55,221 --> 21:13:58,792 What we're teaching today, this week,\n 22223 21:13:58,792 --> 21:14:01,581 It's not really specific to Flask,\n 22224 21:14:01,581 --> 21:14:03,471 It really is a very\ncommon paradigm that you 22225 21:14:03,471 --> 21:14:08,211 could implement in Java, C sharp, or\n 22226 21:14:08,211 --> 21:14:12,595 All right, so let's now\npivot back to VS Code here.
22227 21:14:12,595 --> 21:14:14,512 Let me stop running\nFlask, and let me go ahead 22228 21:14:14,512 --> 21:14:19,922 and create a new folder altogether\n 22229 21:14:19,922 --> 21:14:24,592 And let me go ahead and create\na folder called FroshIMS 22230 21:14:24,592 --> 21:14:27,862 representing freshman intramural\n 22231 21:14:29,422 --> 21:14:32,601 And now I'm going to code an app.py. 22232 21:14:32,601 --> 21:14:36,351 And in anticipation, I'm going to\n 22233 21:14:36,351 --> 21:14:38,251 This one in the FroshIMS folder. 22234 21:14:38,251 --> 21:14:42,351 And then in my templates directory,\n 22235 21:14:42,351 --> 21:14:44,482 and I'm just going to\nget myself started here. 22236 21:14:46,221 --> 21:14:48,741 I'm just copying my layout\nfrom earlier because most 22237 21:14:48,741 --> 21:14:53,031 of my interesting work, this time, is\n 22238 21:14:53,032 --> 21:14:54,512 So what is it we're creating? 22239 21:14:54,512 --> 21:14:58,282 So literally, the very first\nthing I wrote as a web application 22240 21:14:58,282 --> 21:15:02,042 20 years ago, was a site that\nliterally looked like this. 22241 21:15:02,042 --> 21:15:04,131 So I was like a sophomore\nor junior at the time. 22242 21:15:04,131 --> 21:15:06,982 I'd taken CS50 and a\nfollow-on class only. 22243 21:15:06,982 --> 21:15:08,889 I had no idea how to do web programming. 22244 21:15:08,888 --> 21:15:11,721 Neither of those two courses taught\n 22245 21:15:11,721 --> 21:15:14,391 So I taught myself, at the\ntime, a language called Perl. 22246 21:15:14,392 --> 21:15:18,000 And I learned a little something about\n 22247 21:15:18,000 --> 21:15:20,542 can't even say googled enough,\nbecause Google didn't come out 22248 21:15:22,142 --> 21:15:26,432 Read enough online to figure out how to\n 22249 21:15:26,432 --> 21:15:29,282 on campus, first years,\ncould actually register 22250 21:15:29,282 --> 21:15:32,162 via a website for intramural sports. 
22251 21:15:32,161 --> 21:15:35,161 Back in my day, you would\nliterally fill out a piece of paper 22252 21:15:35,161 --> 21:15:38,491 and then walk it across the yard to\n 22253 21:15:38,491 --> 21:15:40,861 slide it under the dorm\nof the proctor or RA 22254 21:15:40,861 --> 21:15:43,381 and thus you were\nregistered for sports. 22255 21:15:46,532 --> 21:15:48,512 There was an internet, it\njust wasn't really being 22256 21:15:48,512 --> 21:15:50,801 used much on campus or more generally. 22257 21:15:50,801 --> 21:15:54,301 So background images\nthat repeat infinitely 22258 21:15:54,301 --> 21:15:56,611 were in vogue, apparently, at the time. 22259 21:15:56,611 --> 21:15:58,951 All of this was like images\nthat I had to hand make 22260 21:15:58,952 --> 21:16:03,881 because we did not have the features\n 22261 21:16:03,881 --> 21:16:07,801 So it was really just HTML, and it was\n 22262 21:16:09,512 --> 21:16:12,301 And it was really just\nthe same building blocks 22263 21:16:12,301 --> 21:16:15,131 that we here today already have. 22264 21:16:15,131 --> 21:16:17,702 So we'll get rid of all of\nthe imagery and focus more 22265 21:16:17,702 --> 21:16:19,952 on the functionality than the\naesthetics, but let's see 22266 21:16:19,952 --> 21:16:23,252 if we can whip up a web\napplication via which someone could 22267 21:16:23,251 --> 21:16:26,281 register for one such intramural sport. 22268 21:16:26,282 --> 21:16:29,972 So in app.py, let me go ahead and\nimport some familiar things now. 22269 21:16:29,971 --> 21:16:33,481 From flask, let's import\ncapital Flask, which 22270 21:16:33,482 --> 21:16:36,482 is that function we need to kick\n 22271 21:16:36,482 --> 21:16:39,661 render_template, so we have the\n 22272 21:16:39,661 --> 21:16:42,122 those templates, and request\nso that we have the ability 22273 21:16:42,122 --> 21:16:44,850 to get at input from the human. 
22274 21:16:44,850 --> 21:16:46,892 Let me go ahead and create\nthe application itself 22275 21:16:46,892 --> 21:16:49,302 using this magical incantation here. 22276 21:16:49,301 --> 21:16:56,141 And then let's go ahead and define a\n 22277 21:16:56,142 --> 21:16:58,022 I'm going to define a\nfunction called index. 22278 21:16:58,021 --> 21:17:00,661 But just to be clear, this\nfunction could be anything. 22279 21:17:00,661 --> 21:17:03,221 Foo, bar, baz, anything else. 22280 21:17:03,221 --> 21:17:05,281 But I tend to name\nthem in a manner that's 22281 21:17:05,282 --> 21:17:07,022 consistent with what\nthe route is called. 22282 21:17:07,021 --> 21:17:09,061 But you could call it\nanything you want, it's 22283 21:17:09,062 --> 21:17:12,601 just the function that will get\n 22284 21:17:12,601 --> 21:17:14,851 Now, let me go ahead here\nand just get things started. 22285 21:17:14,851 --> 21:17:18,294 Return, render template of index.html. 22286 21:17:18,294 --> 21:17:19,711 Just keep it simple, nothing more. 22287 21:17:19,711 --> 21:17:22,622 So there's nothing really\nFroshIM specific about this here 22288 21:17:22,622 --> 21:17:25,172 I just want to make sure I'm\ndoing everything correctly. 22289 21:17:25,172 --> 21:17:27,152 Meanwhile, I've got my layout. 22290 21:17:27,152 --> 21:17:31,562 OK, let me go ahead, and in my\ntemplates directory, code a file 22291 21:17:33,751 --> 21:17:40,051 And let's just do extends\nlayout.html at the top 22292 21:17:40,051 --> 21:17:42,152 just so that we get\nbenefit from that template. 22293 21:17:42,152 --> 21:17:44,012 And down here, I'm just\ngoing to say to do. 22294 21:17:44,012 --> 21:17:47,012 Just so that I have something\ngoing on visually to make sure 22295 21:17:48,452 --> 21:17:51,782 In my FroshIMS directory,\nlet me do Flask run. 22296 21:17:51,782 --> 21:17:54,961 Let me now go back to my previous URL,\n 22297 21:17:54,961 --> 21:17:59,611 But now, I'm serving\nup the FroshIM site. 
22298 21:18:01,351 --> 21:18:04,622 That's because I\nscrewed up accidentally. 22299 21:18:04,622 --> 21:18:07,851 What did I do wrong in index.html? 22300 21:18:13,361 --> 21:18:16,229 This file extends layout.html, but-- 22301 21:18:16,229 --> 21:18:17,771 AUDIENCE: You left out the block tag? 22302 21:18:18,271 --> 21:18:22,661 I forgot to tell Flask what\nto plug into that layout. 22303 21:18:22,661 --> 21:18:26,801 So I just need to say block body, and\n 22304 21:18:26,801 --> 21:18:28,841 or whatever I want to\neventually get around to. 22305 21:18:31,751 --> 21:18:34,091 OK, so now it looks ugly, more cryptic. 22306 21:18:34,092 --> 21:18:36,972 But this is, again, the\nessence of doing templating. 22307 21:18:36,971 --> 21:18:41,101 Let me now restart Flask up\nhere, let me go back to the page. 22308 21:18:41,733 --> 21:18:43,691 Crossing my fingers this\ntime, and there we go. 22309 21:18:44,081 --> 21:18:46,241 So it's not the application\nI want, but at least I 22310 21:18:46,241 --> 21:18:48,630 know I have some of the\nplumbing there by default. 22311 21:18:48,630 --> 21:18:50,922 All right, so if I want the\nuser to be able to register 22312 21:18:50,922 --> 21:18:53,112 for one of these sports,\nlet's enhance, now 22313 21:18:53,111 --> 21:18:56,051 index.html to actually\nhave a form that's 22314 21:18:56,051 --> 21:18:59,811 maybe got a dropdown menu for all of\n 22315 21:18:59,812 --> 21:19:01,991 So let me go into this template here. 
22316 21:19:01,991 --> 21:19:05,171 And instead of to do, let's\ngo ahead and give myself 22317 21:19:05,172 --> 21:19:08,922 how about an H1 tag that just says\n 22318 21:19:09,732 --> 21:19:13,032 How about a form tag\nthat's going to use POST 22319 21:19:13,032 --> 21:19:16,211 just because it's not really necessary\n 22320 21:19:17,292 --> 21:19:20,021 The action for that, how\nabout we plan to create 22321 21:19:20,021 --> 21:19:24,971 a register route so that we're sending\n 22322 21:19:24,971 --> 21:19:26,531 So we'll have to come back to that. 22323 21:19:26,532 --> 21:19:32,412 In here, let me go ahead and create,\n 22324 21:19:35,782 --> 21:19:37,824 How about a name equals\nname, because I'm 22325 21:19:37,823 --> 21:19:40,781 going to ask the student for their\n 22326 21:19:41,629 --> 21:19:43,211 And the type of this box will be text. 22327 21:19:43,211 --> 21:19:45,372 So this is pretty much\nidentical to before. 22328 21:19:45,372 --> 21:19:48,822 But if you've not seen this\nyet, let's create a select menu 22329 21:19:48,822 --> 21:19:51,042 a so-called dropdown menu in HTML. 22330 21:19:51,042 --> 21:19:54,551 And maybe the first option\nI want to be in there 22331 21:19:54,551 --> 21:19:58,331 is going to be, oh, how\nabout the current three 22332 21:19:58,331 --> 21:20:05,322 sports for the fall, which are\nbasketball, and another option 22333 21:20:05,322 --> 21:20:07,902 is going to be soccer,\nand a third option is 22334 21:20:07,902 --> 21:20:13,461 going to be ultimate frisbee for\n 22335 21:20:13,461 --> 21:20:14,922 So I've got those three options. 22336 21:20:16,062 --> 21:20:20,412 I haven't implemented my route yet,\n 22337 21:20:20,411 --> 21:20:23,682 to go back now and check\nif my form has reloaded. 22338 21:20:23,682 --> 21:20:26,419 So let me go ahead and\nstop and start Flask. 
22339 21:20:26,419 --> 21:20:28,961 You'll see there's ways to\nautomate the process of restarting 22340 21:20:28,961 --> 21:20:31,122 the server that we'll do for\nyou for problem set nine 22341 21:20:31,122 --> 21:20:32,832 so you don't have to\nkeep stopping Flask. 22342 21:20:32,831 --> 21:20:36,822 Let me reload my index route\nand OK, it's not that pretty. 22343 21:20:39,762 --> 21:20:41,832 But it now has at least\nsome functionality 22344 21:20:41,831 --> 21:20:44,921 where I can type in my name\nand then type in the sport. 22345 21:20:44,922 --> 21:20:47,502 Now, I might be biasing\npeople toward basketball. 22346 21:20:47,501 --> 21:20:51,971 Like UX wise, user experience\nwise, it's obnoxious to precheck 22347 21:20:51,971 --> 21:20:53,572 basketball but not the others. 22348 21:20:53,572 --> 21:20:55,572 So there's some little\ntweaks we can make there. 22349 21:20:55,572 --> 21:20:57,822 Let me go back into index.html. 22350 21:20:57,822 --> 21:21:04,110 Let me create an empty option up here\n 22351 21:21:04,110 --> 21:21:05,652 going to have the name of any sports. 22352 21:21:05,652 --> 21:21:08,110 But it's just going to have a\nword I want the human to see 22353 21:21:08,110 --> 21:21:12,622 so I'm actually going to disable this\n 22354 21:21:12,622 --> 21:21:15,012 But I'm going to say sport up here. 22355 21:21:15,012 --> 21:21:18,491 And there's different ways to do this,\n 22356 21:21:22,422 --> 21:21:24,762 Creating a placeholder\nsports so that the user 22357 21:21:24,762 --> 21:21:26,862 sees something in the dropdown. 22358 21:21:26,861 --> 21:21:29,932 Let me go ahead and restart\nFlask, reload the page 22359 21:21:29,932 --> 21:21:31,932 and now it's just going\nto be marginally better. 22360 21:21:31,932 --> 21:21:34,122 Now you see sport that's\nchecked by default 22361 21:21:34,122 --> 21:21:37,092 but you have to check one of\nthese other ones ultimately. 22362 21:21:37,092 --> 21:21:38,482 All right, so that's pretty good. 
22363 21:21:38,482 --> 21:21:40,692 So let me now type in David. 22364 21:21:40,691 --> 21:21:43,121 I'll register for ultimate frisbee. 22365 21:21:43,122 --> 21:21:46,512 OK, I definitely forgot something. 22366 21:21:48,702 --> 21:21:52,782 All right, so input type equals submit. 22367 21:21:52,782 --> 21:21:54,102 All right, let's put that in. 22368 21:21:57,544 --> 21:21:58,961 Submit could be a little prettier. 22369 21:21:58,961 --> 21:22:03,491 Recall that we can change some of\n 22370 21:22:03,491 --> 21:22:05,652 The value of this button\nshould be register, maybe 22371 21:22:05,652 --> 21:22:07,241 just to make things a little prettier. 22372 21:22:07,241 --> 21:22:10,456 Let me now reload the page and register. 22373 21:22:10,456 --> 21:22:13,331 All right, so now we really have\n 22374 21:22:13,331 --> 21:22:17,741 that I created some years ago to let\n 22375 21:22:17,741 --> 21:22:21,641 So let's go, now, and create maybe\n 22376 21:22:22,691 --> 21:22:25,686 And in here, if we want to\nallow the user to register 22377 21:22:25,687 --> 21:22:28,812 let's do a little bit of error checking\n 22378 21:22:28,812 --> 21:22:30,612 What could the user do wrong? 22379 21:22:30,611 --> 21:22:32,801 Because assume that they will. 22380 21:22:32,801 --> 21:22:34,631 One, they might not type their name. 22381 21:22:34,631 --> 21:22:36,505 Two, they might not choose a sport. 22382 21:22:36,505 --> 21:22:38,172 So they might just submit an empty form. 22383 21:22:38,172 --> 21:22:40,302 So that's two things we\ncould check for, just 22384 21:22:40,301 --> 21:22:43,721 so that we're not scoring bogus\n 22385 21:22:43,721 --> 21:22:47,351 So let's create another\nroute called greet, /greet. 22386 21:22:47,351 --> 21:22:50,501 And then in this route, let's\ncreate a function called greet 22387 21:22:50,501 --> 21:22:52,661 but can be called anything we want. 
22388 21:22:52,661 --> 21:22:56,054 And then let's go ahead, and in\n 22389 21:22:56,054 --> 21:22:57,221 and validate the submission. 22390 21:22:57,221 --> 21:22:59,351 So a little comment to myself here. 22391 21:22:59,351 --> 21:23:08,031 How about if there is not a\nrequest.form.get("name") value 22392 21:23:08,032 --> 21:23:10,211 so that is if that\nfunction returns nothing 22393 21:23:10,211 --> 21:23:14,062 like quote unquote, or the\nspecial word None in Python. 22394 21:23:14,062 --> 21:23:25,042 Or request.form.get("sport") not\nin quote unquote, what were they? 22395 21:23:25,042 --> 21:23:32,122 Basketball, the other one was soccer,\n 22396 21:23:32,122 --> 21:23:35,961 Getting a little long, but notice\n 22397 21:23:35,961 --> 21:23:39,051 If the user did not\ngive us a name, that is 22398 21:23:39,051 --> 21:23:41,671 if this function returns\nthe equivalent of False 22399 21:23:41,672 --> 21:23:45,592 which is, quote unquote, or literally\n 22400 21:23:45,592 --> 21:23:52,252 Or if the sport the user provided is\n 22401 21:23:52,251 --> 21:23:56,061 or ultimate frisbee, which I've defined\n 22402 21:23:56,062 --> 21:23:57,652 and just yell at the user in some way. 22403 21:23:57,652 --> 21:24:02,961 Let's return render_template\nof failure.html. 22404 21:24:02,961 --> 21:24:06,211 And that's just going to be some\n 22405 21:24:06,211 --> 21:24:08,751 Otherwise, if they get\nthis far, let's go ahead 22406 21:24:08,751 --> 21:24:12,111 and confirm registration\nby just returning-- whoops 22407 21:24:12,111 --> 21:24:18,182 returning render_template of\n"success.html". 22408 21:24:18,182 --> 21:24:20,461 All right, so a couple\nquick things to do. 22409 21:24:20,461 --> 21:24:25,111 Let me first go in, and in\nmy templates directory 22410 21:24:25,111 --> 21:24:28,291 let's create this failure.html file. 
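The validation just described boils down to one condition. Here is a minimal sketch of that logic as a plain function, assuming the form behaves like a dictionary whose `get` returns `None` for a missing key (as `request.form.get` does in Flask). `VALID_SPORTS` and `choose_template` are hypothetical names, and the function returns a template name rather than rendering anything.

```python
# The three sports named in the lecture; the constant name is an assumption.
VALID_SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]


def choose_template(form):
    """Return the template the register route would render for this form."""
    # A missing key (None) and an empty string are both falsy,
    # so a single `not` covers "no name at all" and "blank name".
    if not form.get("name") or form.get("sport") not in VALID_SPORTS:
        return "failure.html"
    return "success.html"


print(choose_template({"name": "David", "sport": "Soccer"}))
print(choose_template({"name": "", "sport": "Soccer"}))
print(choose_template({"name": "David"}))
```

Checking the sport against a list on the server matters even though the dropdown only offers three choices, because nothing stops a user from sending a different value by hand.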
22411 21:24:28,292 --> 21:24:31,191 And this is just meant to\nbe a message to the user 22412 21:24:31,191 --> 21:24:34,262 that they fail to provide\nthe information correctly. 22413 21:24:34,262 --> 21:24:37,042 So let me go ahead and in failure.html. 22414 21:24:40,012 --> 21:24:44,902 So let me extend layout.html and in\n 22415 21:24:44,902 --> 21:24:47,902 I'll just yell at them like that so\n 22416 21:24:47,902 --> 21:24:52,521 And then let me create one other\nfile called success.html, that 22417 21:24:52,521 --> 21:24:54,904 similarly is mostly just Jinja syntax. 22418 21:24:54,904 --> 21:24:57,322 And I'm just going to say for\nnow, even though they're not 22419 21:24:57,322 --> 21:24:59,902 technically registered in any\ndatabase, you are registered. 22420 21:24:59,902 --> 21:25:02,032 That's what we mean by success. 22421 21:25:02,032 --> 21:25:05,211 All right, so let me go ahead,\nand back in my FroshIMS 22422 21:25:07,131 --> 21:25:09,322 Let me go back to the form and reload. 22423 21:25:10,911 --> 21:25:13,881 All right, so now let me\nnot cooperate and just 22424 21:25:13,881 --> 21:25:17,061 immediately click Register impatiently. 22425 21:25:20,721 --> 21:25:24,769 Register-- oh, I'm\nconfusing our two examples. 22426 21:25:24,770 --> 21:25:26,062 All right, I spotted the error. 22427 21:25:32,241 --> 21:25:35,812 There's where I am, what did\nI actually invent over here? 22428 21:25:44,562 --> 21:25:45,812 AUDIENCE: Register, not greet. 22429 21:25:47,611 --> 21:25:50,731 I had last example on my mind,\nso the route should be register. 22430 21:25:50,732 --> 21:25:54,122 Ironically, the function could be greet,\n 22431 21:25:54,122 --> 21:25:57,792 But to keep ourselves sane, let's\n 22432 21:25:57,792 --> 21:26:00,218 Let me go ahead now and\nstart Flask as intended. 22433 21:26:00,218 --> 21:26:02,551 Let me reload the form just\nto make sure all is working. 
22434 21:26:02,551 --> 21:26:06,902 Now, let me not cooperate and be\na bad user, clicking register-- 22435 21:26:07,982 --> 21:26:10,652 OK, another unintended mistake. 22436 21:26:10,652 --> 21:26:12,881 But this one we've seen before. 22437 21:26:12,881 --> 21:26:15,721 Notice that by default,\nroutes only support GET. 22438 21:26:15,721 --> 21:26:18,721 So if I want to\nspecifically support POST 22439 21:26:18,721 --> 21:26:25,471 I have to pass in, by a methods\n 22440 21:26:25,471 --> 21:26:28,351 methods that could be\nGET comma POST, but if I 22441 21:26:28,351 --> 21:26:32,221 have no need for a GET in\n 22442 21:26:32,221 --> 21:26:34,741 All right, now let's\ndo this one last time. 22443 21:26:34,741 --> 21:26:37,441 Reload the form to make sure\neverything's OK, click Register 22444 21:26:37,441 --> 21:26:39,194 and you are not registered. 22445 21:26:40,111 --> 21:26:42,528 All right, let me go ahead and\nat least give them my name. 22446 21:26:44,312 --> 21:26:49,662 Fine, I'm going to go ahead and be\n 22447 21:26:53,402 --> 21:26:58,542 What should I-- what\ndid I mean to do here? 22448 21:26:58,542 --> 21:27:00,572 All right, so let's figure this out. 22449 21:27:00,572 --> 21:27:04,711 How to debug something like this,\n 22450 21:27:07,081 --> 21:27:10,351 How can we go about\ntroubleshooting this? 22451 21:27:10,351 --> 21:27:13,202 Turn this into a teachable moment. 22452 21:27:13,202 --> 21:27:15,782 All right, well first,\nsome safety checks. 22453 21:27:17,372 --> 21:27:20,521 Let me go ahead and view page\nsource, a good rule of thumb. 22454 21:27:20,521 --> 21:27:23,081 Look at the HTML that you\nactually sent to the user. 22455 21:27:23,081 --> 21:27:27,251 So here, I have an\ninput with a name name. 22456 21:27:27,251 --> 21:27:29,711 So that's what I\nintended, that looks OK. 
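The GET-only default mentioned a moment ago can be modeled with a toy route table. This is not Flask's implementation, just a sketch of the idea: `ROUTES`, `route`, and `dispatch` are hypothetical names, and in a real Flask app the equivalent fix is `@app.route("/register", methods=["POST"])`.

```python
# Hypothetical route table: path -> (allowed methods, handler).
ROUTES = {}


def route(path, methods=("GET",)):
    """Register a handler; like Flask, allow only GET unless told otherwise."""
    def decorator(handler):
        ROUTES[path] = (set(methods), handler)
        return handler
    return decorator


@route("/")
def index():
    return "form page"


@route("/register", methods=["POST"])
def register():
    return "registered"


def dispatch(method, path):
    """Return a (status, body) pair for a request."""
    if path not in ROUTES:
        return 404, "Not Found"
    allowed, handler = ROUTES[path]
    if method not in allowed:
        # The form used POST but the route only accepted GET (or vice versa).
        return 405, "Method Not Allowed"
    return 200, handler()


print(dispatch("POST", "/register"))
print(dispatch("GET", "/register"))
```

A 405 response is therefore a hint to compare the form's method attribute with the methods the route accepts.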
22457 21:27:29,711 --> 21:27:32,551 Ah, I see it already, even\nthough you, if you've never 22458 21:27:32,551 --> 21:27:35,792 used a select menu, you might\nnot know what, apparently 22459 21:27:35,792 --> 21:27:41,911 is missing from here that I\ndid have for my text input. 22460 21:27:41,911 --> 21:27:45,466 Just intuitively, logically. 22461 21:27:45,467 --> 21:27:47,342 What's going through my\nhead, embarrassingly 22462 21:27:47,342 --> 21:27:51,692 is, all right, if my form thinks\n 22463 21:27:51,691 --> 21:27:55,471 how did I create a situation in which\n 22464 21:27:55,471 --> 21:27:57,481 Well, name, I don't think\nit's going to be blank 22465 21:27:57,482 --> 21:28:01,262 because I explicitly gave\nthis text field a name name 22466 21:28:01,262 --> 21:28:02,972 and that did work last time. 22467 21:28:02,971 --> 21:28:06,421 I've now given a second input\nin the form of the select menu. 22468 21:28:06,422 --> 21:28:13,112 But what seems to be missing here\nthat I'm assuming exists here? 22469 21:28:13,111 --> 21:28:16,381 It's just a dumb mistake I made. 22470 21:28:16,381 --> 21:28:19,448 What might be missing here? 22471 21:28:19,448 --> 21:28:23,552 If request.form gives you all of\nthe inputs that the user might 22472 21:28:23,551 --> 21:28:26,551 have typed in, let me\ngo into my actual code 22473 21:28:26,551 --> 21:28:30,451 here in my form and name equals sport. 22474 21:28:30,452 --> 21:28:32,472 I just didn't give a name to that input. 22475 21:28:32,471 --> 21:28:34,651 So it exists, and the\nbrowser doesn't care. 22476 21:28:34,652 --> 21:28:36,485 It's still going to\ndisplay the form to you 22477 21:28:36,485 --> 21:28:40,211 it just hasn't given it a unique name\n 22478 21:28:40,211 --> 21:28:42,551 So now, if I'm not going\nto put my foot in my mouth 22479 21:28:42,551 --> 21:28:44,741 I think that's what I did wrong. 
22480 21:28:44,741 --> 21:28:46,711 And again, my process\nfor figuring that out 22481 21:28:46,711 --> 21:28:49,043 was looking at my code,\nthinking through logically 22482 21:28:49,043 --> 21:28:50,251 is this right, is this right? 22483 21:28:50,251 --> 21:28:52,421 No, I was missing the name there. 22484 21:28:52,422 --> 21:28:55,382 So let's run Flask,\nlet's reload the form 22485 21:28:55,381 --> 21:28:59,941 just to make sure it's all defaults\n 22486 21:28:59,941 --> 21:29:04,441 in ultimate frisbee, crossing\nmy fingers extra hard this time. 22487 21:29:06,932 --> 21:29:08,664 I did not intend to\nscrew up in that way 22488 21:29:08,664 --> 21:29:10,831 but that's exactly the right\nkind of thought process 22489 21:29:10,831 --> 21:29:12,152 to diagnose issues like this. 22490 21:29:12,152 --> 21:29:16,021 Go back to the basics, go back to what\n 22491 21:29:16,021 --> 21:29:18,432 and just rule things in and out. 22492 21:29:18,432 --> 21:29:21,182 There's only a finite number of\n 22493 21:29:22,054 --> 21:29:23,410 AUDIENCE: Are you [INAUDIBLE]. 22494 21:29:25,581 --> 21:29:27,081 DAVID: Excuse-- say a little louder? 22495 21:29:27,081 --> 21:29:30,981 AUDIENCE: I don't understand why\nname equals sport [INAUDIBLE].. 22496 21:29:30,982 --> 21:29:33,631 DAVID: Why did name equal\nsport address the problem? 22497 21:29:33,631 --> 21:29:35,872 Well, let's first go back to the HTML. 22498 21:29:35,872 --> 21:29:43,172 Previously, it was just the reality that\n 22499 21:29:44,601 --> 21:29:47,932 But names, or more\ngenerally, key value pairs 22500 21:29:47,932 --> 21:29:51,211 is how information is sent\nfrom a form to the server. 22501 21:29:51,211 --> 21:29:56,422 So if there's no name, there's no key to\n 22502 21:29:56,422 --> 21:30:00,082 It would be like nothing equals ultimate\n 22503 21:30:00,081 --> 21:30:02,432 The browser is just\nnot going to send it. 
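The point about key value pairs can be made concrete with the standard library. The sketch below imitates how a form body is built from named controls and shows what the server sees with and without name="sport". `encode_form` is a hypothetical helper, not a browser API, and it simplifies the real submission rules down to one: controls without a name are skipped.

```python
from urllib.parse import parse_qs, urlencode


def encode_form(controls):
    """Encode (name, value) controls as a form body, skipping unnamed ones."""
    return urlencode([(name, value) for name, value in controls if name])


# The bug: the text box has a name, the select menu does not.
broken = encode_form([("name", "David"), (None, "Ultimate Frisbee")])

# The fix: name="sport" gives the value a key to travel under.
fixed = encode_form([("name", "David"), ("sport", "Ultimate Frisbee")])

print(broken)
print(fixed)
print(parse_qs(fixed))
```

With the broken form, the server-side lookup for "sport" finds nothing, which is why validation failed even though a sport was visibly selected on the page.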
22504 21:30:02,432 --> 21:30:08,331 However, in app.py, I was naively\n 22505 21:30:08,331 --> 21:30:11,061 there would be a name called\nquote unquote "sport. 22506 21:30:11,062 --> 21:30:13,702 It could have been anything,\nbut I was assuming it was sport. 22507 21:30:13,702 --> 21:30:15,532 But I never told the form that. 22508 21:30:15,532 --> 21:30:19,022 And if I really wanted to dig in,\n 22509 21:30:19,021 --> 21:30:22,012 Let me go back to the\nway it was a moment ago. 22510 21:30:22,012 --> 21:30:25,801 Let me get rid of the name\nof the sport dropdown menu. 22511 21:30:25,801 --> 21:30:30,981 Let me rerun Flask down here\nand reload the form itself 22512 21:30:30,982 --> 21:30:33,411 after it finishes being served. 22513 21:30:34,581 --> 21:30:39,277 View Developer Tools, and then let me\n 22514 21:30:39,277 --> 21:30:41,152 we played around with\na little bit last week. 22515 21:30:41,152 --> 21:30:44,334 And we also played around with Curl,\n 22516 21:30:44,334 --> 21:30:46,042 Here's another-- here's\nwhat I would have 22517 21:30:46,042 --> 21:30:49,411 done if I still wasn't seeing the error\n 22518 21:30:49,411 --> 21:30:53,512 I would have typed in my name as before,\n 22519 21:30:53,512 --> 21:30:55,252 I would have clicked register. 22520 21:30:55,251 --> 21:30:59,241 And now, I would have\nlooked at the HTTP request. 22521 21:30:59,241 --> 21:31:01,342 And I would click on Register here. 22522 21:31:01,342 --> 21:31:04,942 And just like we did last week, I\n 22523 21:31:04,941 --> 21:31:07,671 And there's a whole lot of stuff\nthat we can typically ignore. 22524 21:31:07,672 --> 21:31:10,792 But here, let me zoom\nin, way at the bottom 22525 21:31:10,792 --> 21:31:13,131 what Chrome's developer\ntools are doing for me 22526 21:31:13,131 --> 21:31:16,141 it's showing me all of the\nform data that was submitted. 22527 21:31:16,142 --> 21:31:18,712 So this really would have\nbeen my telltale clue. 
22528 21:31:18,711 --> 21:31:21,982 I'm just not sending the sport,\neven if the human typed it in. 22529 21:31:21,982 --> 21:31:23,991 And logically, because\nI've done this before 22530 21:31:23,991 --> 21:31:26,932 that must mean I didn't\ngive the thing a name. 22531 21:31:28,101 --> 21:31:31,372 Like good programmers, web developers\n 22532 21:31:32,422 --> 21:31:34,160 They're not writing bug-free code. 22533 21:31:34,160 --> 21:31:35,452 That's not the point to get to. 22534 21:31:35,452 --> 21:31:38,542 The point to get to is\nbeing a good diagnostician 22535 21:31:38,542 --> 21:31:40,551 I would say, in these cases. 22536 21:31:46,812 --> 21:31:52,152 AUDIENCE: What if you want to\n 22537 21:31:52,152 --> 21:31:54,444 DAVID: I'm sorry, a little bit louder? 22538 21:31:54,444 --> 21:31:57,012 AUDIENCE: If you want to\nedit in CSS or anything 22539 21:31:57,012 --> 21:32:01,422 in HTML, once you have to fix\nthe template, how do you do that? 22540 21:32:01,422 --> 21:32:05,532 DAVID: So how would you edit\nCSS if you have these templates? 22541 21:32:05,532 --> 21:32:07,842 That process we'll\nactually see before long. 22542 21:32:07,842 --> 21:32:09,467 It's almost going to be the exact same. 22543 21:32:09,467 --> 21:32:12,550 Just to give you a teaser for this,\n 22544 21:32:12,550 --> 21:32:15,402 but we'll give you some distribution\n 22545 21:32:15,402 --> 21:32:17,862 You can absolutely still\ndo something like this. 22546 21:32:17,861 --> 21:32:22,660 Link href equals quote\nunquote "styles" dot 22547 21:32:22,660 --> 21:32:27,370 CSS rel equals stylesheet, that's one\n 22548 21:32:27,370 --> 21:32:31,422 The only difference today, using Flask,\n 22549 21:32:31,422 --> 21:32:34,041 by convention, should go\nin your static folder. 22550 21:32:34,040 --> 21:32:36,072 So the change you would\nmake in your layout 22551 21:32:36,072 --> 21:32:40,001 would be to say that styles dot\nCSS is in your static folder. 
22552 21:32:40,001 --> 21:32:43,990 And then, if I go into\nmy FroshIMS directory 22553 21:32:43,990 --> 21:32:46,330 I can create a static folder. 22554 21:32:46,331 --> 21:32:48,822 I can CD into it,\nnothing's there by default. 22555 21:32:48,822 --> 21:32:51,762 But if I now code a\nfile called styles.css 22556 21:32:51,762 --> 21:32:54,822 I could now do something like this body. 22557 21:32:54,822 --> 21:33:05,982 And in here, I could say background\n 22558 21:33:05,982 --> 21:33:10,211 Let me go ahead now and restart\nFlask in the FroshIMS directory. 22559 21:33:10,211 --> 21:33:12,822 Cross my fingers because\nI'm doing this on the fly. 22560 21:33:12,822 --> 21:33:16,161 Go back to my form and reload. 22561 21:33:16,160 --> 21:33:19,751 Voila, now we've tied together\nlast week's stuff as well. 22562 21:33:19,751 --> 21:33:22,853 If I answered the right question? 22563 21:33:22,854 --> 21:33:27,202 AUDIENCE: [INAUDIBLE] change\none page and not the other. 22564 21:33:27,202 --> 21:33:30,202 DAVID: If you want to change one page\n 22565 21:33:32,122 --> 21:33:36,952 In that case, you might want to have\n 22566 21:33:38,152 --> 21:33:42,381 You could use different classes in one\n 22567 21:33:42,381 --> 21:33:43,801 There's different ways to do that. 22568 21:33:43,801 --> 21:33:47,751 You could even have a\nplaceholder in your layout 22569 21:33:47,751 --> 21:33:52,221 that allows you to plug in\nthe URL of a specific style 22570 21:33:52,221 --> 21:33:53,661 sheet in your individual files. 22571 21:33:53,661 --> 21:33:56,432 But that starts to get\nmore complicated quickly. 22572 21:33:56,432 --> 21:33:58,172 So in short, you can absolutely do it. 22573 21:33:58,172 --> 21:34:01,942 But typically, I would\nsay most websites try not 22574 21:34:01,941 --> 21:34:03,652 to use different style Sheets per page. 22575 21:34:03,652 --> 21:34:06,322 They reuse the styles\nas much as they can. 
22576 21:34:06,322 --> 21:34:08,572 All right, let me go ahead\nand revert this real quick. 22577 21:34:08,572 --> 21:34:11,512 And let's start to add a little\nbit more functionality here. 22578 21:34:11,512 --> 21:34:14,392 I'm going to go ahead and just\n 22579 21:34:14,392 --> 21:34:16,162 to not complicate things just yet. 22580 21:34:16,161 --> 21:34:19,251 And let's go ahead and just play\n 22581 21:34:19,851 --> 21:34:23,264 In my form here, the dropdown\nmenu is perfectly fine. 22582 21:34:24,182 --> 21:34:27,411 But suppose that I wanted to\nchange it to checkboxes instead. 22583 21:34:27,411 --> 21:34:31,432 Maybe I want students to be able to\n 22584 21:34:31,432 --> 21:34:34,822 Well, it might make sense to\nclean this up in a couple of ways. 22585 21:34:35,572 --> 21:34:40,762 Before we even get into the checkboxes,\n 22586 21:34:40,762 --> 21:34:45,211 Notice that I've hardcoded basketball,\n 22587 21:34:45,211 --> 21:34:49,592 And if you recall, in app.py, I also\n 22588 21:34:49,592 --> 21:34:52,932 And any time you see copy paste\nor the equivalent thereof 22589 21:34:52,932 --> 21:34:54,601 it feels like we could do better. 22590 21:34:54,601 --> 21:34:56,402 So what if I instead do this? 22591 21:34:56,402 --> 21:35:00,711 What if I instead give myself\na global variable, SPORTS? 22592 21:35:00,711 --> 21:35:03,111 I'll capitalize the word\njust to connote that it's 22593 21:35:03,111 --> 21:35:07,161 meant to be constant even though\n 22594 21:35:07,161 --> 21:35:09,741 The first sport will be basketball. 22595 21:35:11,721 --> 21:35:15,921 The third will be ultimate frisbee. 22596 21:35:15,922 --> 21:35:19,942 Now I have one convenient\nplace to store all of my sports 22597 21:35:19,941 --> 21:35:22,461 if it changes next semester\nor next year or whatnot. 22598 21:35:22,461 --> 21:35:24,471 But notice what I could do, too. 22599 21:35:24,471 --> 21:35:26,101 I could now do something like this. 
22600 21:35:26,101 --> 21:35:30,111 Let me pass into my\nindex template a variable 22601 21:35:30,111 --> 21:35:34,461 called sports that's equal to\nthat global variable SPORTS. 22602 21:35:34,461 --> 21:35:37,792 Let me go into my index now,\nand this is really, now 22603 21:35:37,792 --> 21:35:41,782 going to hint at the power of\n 22604 21:35:41,782 --> 21:35:45,452 Let me go ahead and get rid of all\n 22605 21:35:45,452 --> 21:35:50,332 and let me show you some slightly\n 22606 21:35:52,851 --> 21:35:54,872 We've not seen this endfor syntax. 22607 21:35:54,872 --> 21:35:57,692 It's like the endblock syntax,\nbut it's as simple as that. 22608 21:35:57,691 --> 21:36:00,951 So you have a start and an end to your\n 22609 21:36:02,392 --> 21:36:08,482 Option curly brace\nsport close curly brace. 22610 21:36:09,831 --> 21:36:12,981 Let me go back into my\nterminal window, do Flask run. 22611 21:36:12,982 --> 21:36:16,161 And if I didn't mess up\nhere, let me go back to this. 22612 21:36:16,161 --> 21:36:18,801 The red's going to go away\nbecause I deleted my CSS. 22613 21:36:18,801 --> 21:36:21,831 And now I still have a sport\ndropdown and all of those sports 22614 21:36:22,771 --> 21:36:24,351 I can make one more improvement now. 22615 21:36:24,351 --> 21:36:27,301 I don't need to mention these\nsame sports manually in app.py. 22616 21:36:27,301 --> 21:36:31,671 I can now just say if the\nuser's inputted sport is not 22617 21:36:31,672 --> 21:36:35,070 in my global variable, SPORTS,\nand ask the same question. 22618 21:36:35,070 --> 21:36:36,862 And this is really\nhandy because if there's 22619 21:36:36,861 --> 21:36:41,001 another sport, for instance, that\ngets added, like say football 22620 21:36:41,001 --> 21:36:43,921 all I have to do is\nchange my global variable. 22621 21:36:43,922 --> 21:36:47,542 And if I reload the form now\nand look in the dropdown, boom 22622 21:36:47,542 --> 21:36:49,732 now I have support for a fourth sport. 
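The benefit of a single global list feeding both the form and the validation can be sketched in plain Python. This is not Jinja; the loop below only plays the role that the template's for loop plays in index.html. `render_options` is a hypothetical helper, and the exact attributes on the placeholder option are an assumption, since the transcript does not spell them out in full.

```python
from html import escape

# One place to list the sports; both the form and the validation read it.
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]


def render_options(sports):
    """Build one <option> per sport, as the template's for loop would."""
    # Placeholder entry so no real sport is preselected (attributes assumed).
    lines = ['<option disabled selected value="">Sport</option>']
    for sport in sports:
        lines.append(f'<option value="{escape(sport)}">{escape(sport)}</option>')
    return "\n".join(lines)


def is_valid(sport):
    """Server-side check against the same list the form was built from."""
    return sport in SPORTS


print(render_options(SPORTS))

# Adding a sport in one place updates the dropdown and the validation together.
SPORTS.append("Football")
print(is_valid("Football"))
```

Because nothing is hardcoded twice, the dropdown and the check cannot drift out of sync when the list changes.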
22623 21:36:49,732 --> 21:36:51,422 And I can keep adding and adding there. 22624 21:36:51,422 --> 21:36:55,012 So here's where templating starts\nto get really powerful in that 22625 21:36:55,012 --> 21:37:00,472 now, in this template, I'm using\nJinja's for loop syntax, which 22626 21:37:00,471 --> 21:37:02,961 is almost identical to\nPython here, except you 22627 21:37:02,961 --> 21:37:06,262 need the curly brace and the percent\n 22628 21:37:07,131 --> 21:37:08,811 But it's the same idea as in Python. 22629 21:37:08,812 --> 21:37:13,072 Iterating over something with a for loop\n 22630 21:37:13,072 --> 21:37:14,991 And this is like every\nwebsite out there. 22631 21:37:15,861 --> 21:37:19,851 When you visit your inbox and you\n 22632 21:37:19,851 --> 21:37:22,491 Google has not hardcoded\nyour emails manually. 22633 21:37:22,491 --> 21:37:24,322 They have grabbed them from a database. 22634 21:37:24,322 --> 21:37:26,072 They have some kind\nof for loop like this 22635 21:37:26,072 --> 21:37:32,057 and are just outputting table row after\n 22636 21:37:32,057 --> 21:37:34,432 All right, so now, let's go\nahead and change this, maybe 22637 21:37:34,432 --> 21:37:40,232 to, oh, how about little\ncheckboxes or radio buttons. 22638 21:37:40,232 --> 21:37:41,572 So let me go ahead and do this. 22639 21:37:41,572 --> 21:37:46,432 Instead of a select menu, I'm going to\n 22640 21:37:46,432 --> 21:37:52,351 For each of these sports let me go\n 22641 21:37:52,351 --> 21:37:55,281 but let me go ahead and\noutput an input tag 22642 21:37:55,282 --> 21:37:59,302 the name for which is quote\nunquote "sport," the type of which 22643 21:37:59,301 --> 21:38:05,001 is checkbox, the value of which is\n 22644 21:38:05,001 --> 21:38:09,403 quote unquote, and then afterward\n 22645 21:38:10,111 --> 21:38:11,911 So you see a word next to the checkbox. 22646 21:38:11,911 --> 21:38:14,161 And we'll look at the result\nof this in just a moment. 
22647 21:38:14,161 --> 21:38:17,721 So it's actually a little simpler\n 22648 21:38:17,721 --> 21:38:20,961 because now watch what\nhappens if I reload my form. 22649 21:38:20,961 --> 21:38:24,322 Different user interface,\nand it's not as pretty 22650 21:38:24,322 --> 21:38:27,601 but it's going to allow users to sign\n 22651 21:38:28,351 --> 21:38:31,682 Now I can click on basketball\nand football and soccer 22652 21:38:31,682 --> 21:38:34,021 or some other combination thereof. 22653 21:38:34,021 --> 21:38:37,801 If I view the page's source, this\n 22654 21:38:37,801 --> 21:38:41,981 I didn't have to type out four\n 22655 21:38:41,982 --> 21:38:45,302 And these things all have\nthe same name, but that's OK. 22656 21:38:45,301 --> 21:38:48,932 It turns out with Flask, if it sees\n 22657 21:38:48,932 --> 21:38:52,922 it's going to hand them back to you as\n 22658 21:38:52,922 --> 21:38:56,192 All right, but suppose we don't want\n 22659 21:38:57,572 --> 21:39:01,202 Let me go ahead and change this\ncheckbox to radio button, which 22660 21:39:01,202 --> 21:39:03,162 a radio button is mutually exclusive. 22661 21:39:03,161 --> 21:39:04,891 So you can only sign up for one. 22662 21:39:04,892 --> 21:39:09,542 So now, once I reload\nthe page, there we go. 22663 21:39:12,001 --> 21:39:16,771 And because I've given each of these\n 22664 21:39:16,771 --> 21:39:20,171 sport," that\'s what makes\nthem mutually exclusive. 22665 21:39:20,172 --> 21:39:23,492 The browser knows all four of\nthese things are types of sports 22666 21:39:23,491 --> 21:39:26,732 therefore I'm only going to let\nyou select one of these things. 22667 21:39:26,732 --> 21:39:29,072 And that's simply because\nthey all have the same name. 22668 21:39:29,072 --> 21:39:32,491 Again, if I view page source, notice\n 22669 21:39:32,491 --> 21:39:36,542 name equals sport, name equals\n 22670 21:39:36,542 --> 21:39:39,703 that each one is going to have. 
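What a server receives when several inputs share one name can be seen with the standard library alone. The request body below is made up for illustration; in Flask the equivalent lookup is `request.form.getlist("sport")`.

```python
from urllib.parse import parse_qs

# Body a browser might send when two checkboxes named "sport" are ticked.
body = "name=David&sport=Basketball&sport=Soccer"
form = parse_qs(body)

print(form["sport"])  # every value submitted under the shared name
print(form["name"])   # a single field still parses as a one-item list
```

With radio buttons the browser sends at most one `sport` value, so the list would hold a single item.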
22671 21:39:39,703 --> 21:39:45,251 All right, any questions,\nthen, on this approach? 22672 21:39:45,751 --> 21:39:47,751 Well, let me go ahead and\nopen a version of this 22673 21:39:47,751 --> 21:39:51,451 that I made in advance that's going\n 22674 21:39:51,452 --> 21:39:53,342 So thus far, we're\nnot quite at the point 22675 21:39:53,342 --> 21:39:56,672 of where this website was, which\n 22676 21:39:56,672 --> 21:39:59,192 like in a database, everyone\nwho had registered for sports. 22677 21:39:59,191 --> 21:40:01,891 Now, we're literally telling\nstudents you are registered 22678 21:40:01,892 --> 21:40:03,812 or you are not registered,\nbut we're literally 22679 21:40:03,812 --> 21:40:05,892 doing nothing with this information. 22680 21:40:05,892 --> 21:40:08,532 So how might we go\nabout implementing this? 22681 21:40:08,532 --> 21:40:10,502 Well, let me go ahead\nand close these tabs 22682 21:40:10,501 --> 21:40:15,901 and let me go into what I call version\n 22683 21:40:15,902 --> 21:40:19,711 And let me go into my source\nnine directory, FroshIMS3 22684 21:40:19,711 --> 21:40:22,652 and let me go ahead and open up app.py. 22685 21:40:22,652 --> 21:40:24,211 So this is a premade version. 22686 21:40:24,211 --> 21:40:26,141 I've gotten rid of\nfootball, in this case. 22687 21:40:26,142 --> 21:40:28,862 But I've added one\nthing at the very top. 22688 21:40:28,861 --> 21:40:33,871 What's, in English, does\nthis represent on line seven? 22689 21:40:33,872 --> 21:40:35,911 What would you describe\nwhat that thing is? 22690 21:40:41,452 --> 21:40:42,172 AUDIENCE: It's an empty dictionary. 22691 21:40:42,172 --> 21:40:43,372 DAVID: Yeah, it's an\nempty dictionary, right? 22692 21:40:43,372 --> 21:40:45,622 Registrants is apparently\na variable on the left. 22693 21:40:45,622 --> 21:40:47,930 It's being assigned an empty\ndictionary on the right. 22694 21:40:47,929 --> 21:40:49,971 And a dictionary, again,\nis just key value pairs. 
22695 21:40:49,971 --> 21:40:53,451 Here, again, is where dictionaries\n 22696 21:40:53,960 --> 21:40:56,752 Because this is going to allow me\n 22697 21:40:56,751 --> 21:40:59,271 for ultimate frisbee, Carter\nregistered for soccer 22698 21:40:59,271 --> 21:41:00,831 Emma registered for something else. 22699 21:41:00,831 --> 21:41:04,311 You can associate keys with\nvalues, names with sports 22700 21:41:04,312 --> 21:41:07,532 assuming a model where you can only\n 22701 21:41:07,532 --> 21:41:12,802 And so let's see what the\nlogic is that handles this. 22702 21:41:12,801 --> 21:41:16,281 Here in my register route\nin the code I've premade 22703 21:41:16,282 --> 21:41:18,202 notice that I'm validating\nthe user's name. 22704 21:41:18,202 --> 21:41:20,392 Slightly differently from\nbefore but same idea. 22705 21:41:20,392 --> 21:41:23,482 I'm using request.form.get\nto get the human's name. 22706 21:41:23,482 --> 21:41:26,512 If not name, so if the\nhuman did not type a name 22707 21:41:26,512 --> 21:41:29,241 I'm going to output error.html. 22708 21:41:29,241 --> 21:41:33,652 But notice I've started to make\n 22709 21:41:33,652 --> 21:41:37,551 I'm telling the user, apparently,\n 22710 21:41:38,631 --> 21:41:41,182 I'm apparently passing\nto my error template 22711 21:41:41,182 --> 21:41:44,535 instead of just failure.html,\na specific message. 22712 21:41:44,535 --> 21:41:45,952 So let's go down this rabbit hole. 22713 21:41:45,952 --> 21:41:52,052 Let me actually go into\ntemplates/error.html, and sure enough 22714 21:41:52,051 --> 21:41:55,762 here's a new file I created here, that\n 22715 21:41:55,762 --> 21:41:59,332 a grumpy cat as part of the error\n 22716 21:41:59,331 --> 21:42:04,581 In my block body I've got an H1 tag\n 22717 21:42:04,581 --> 21:42:07,131 I then have a paragraph\ntag that plugs in whatever 22718 21:42:07,131 --> 21:42:10,881 the error message is that the\ncontroller, app.py, is passing in. 
22719 21:42:10,881 --> 21:42:14,211 And then just for fun, I have a\n 22720 21:42:14,211 --> 21:42:15,751 that there was, in fact, an error. 22721 21:42:18,422 --> 21:42:22,792 I do similarly\nrequest.form.get of sport 22722 21:42:22,792 --> 21:42:24,562 and I store it in a\nvariable called sport. 22723 21:42:24,562 --> 21:42:28,351 If there's no such sport, that is the\n 22724 21:42:28,351 --> 21:42:30,771 then I'm going to render\nerror.html, too, but I'm 22725 21:42:30,771 --> 21:42:33,361 going to give a different\nmessage, missing sport. 22726 21:42:33,361 --> 21:42:38,451 Else, if the sport they did type in\n 22727 21:42:38,452 --> 21:42:41,991 I'm going to render error.html,\nbut complain differently: 22728 21:42:41,991 --> 21:42:44,781 you gave me an invalid sport somehow. 22729 21:42:44,782 --> 21:42:47,032 As if a hacker went into\nthe HTML of the page 22730 21:42:47,032 --> 21:42:49,379 changed it to add their\nown sport like volleyball. 22731 21:42:49,379 --> 21:42:51,711 Even though it's not offered,\nthey submitted volleyball. 22732 21:42:51,711 --> 21:42:55,161 But that's OK, I'm rejecting it, even\n 22733 21:42:55,161 --> 21:42:58,671 tried to send it to me by\nchanging the DOM locally. 22734 21:42:58,672 --> 21:43:00,832 And then really, the magic is just this. 22735 21:43:00,831 --> 21:43:03,441 I remember that this\nperson has registered 22736 21:43:03,441 --> 21:43:06,621 by indexing into the\nregistrants dictionary 22737 21:43:06,622 --> 21:43:11,631 using the name the human typed in as the\n 22738 21:43:12,902 --> 21:43:15,232 Well, I added one final route here. 22739 21:43:15,232 --> 21:43:19,672 I have a /registrants route with a\n 22740 21:43:19,672 --> 21:43:21,622 a template called registrants.html. 22741 21:43:21,622 --> 21:43:26,582 But it takes as input that\nglobal variable just like before. 
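The validation order walked through above (name, then sport, then membership in the list, then store) can be sketched without Flask. The function name `register`, the exact message strings, and the sport list are assumptions for illustration; the lecture's route renders error.html with a message instead of returning a tuple.

```python
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]  # assumed list
REGISTRANTS = {}  # name -> sport, lost when the process exits


def register(form):
    """Validate a submitted form and return (ok, message)."""
    name = form.get("name")
    if not name:
        return False, "Missing name"
    sport = form.get("sport")
    if not sport:
        return False, "Missing sport"
    if sport not in SPORTS:
        # Catches values injected by editing the page's HTML locally.
        return False, "Invalid sport"
    REGISTRANTS[name] = sport  # the name is the key, the sport the value
    return True, "Registered"


print(register({"name": "David"}))
print(register({"name": "David", "sport": "Volleyball"}))
print(register({"name": "David", "sport": "Ultimate Frisbee"}))
print(REGISTRANTS)
```

Keying the dictionary by name means a second registration under the same name overwrites the first, which matches the one-sport-per-person model mentioned earlier.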
22742 21:43:26,581 --> 21:43:32,961 So let's go down this rabbit hole. Let me\n 22743 21:43:34,402 --> 21:43:38,182 It looks a little crazy big,\nbut it extends the layout. 22744 21:43:39,381 --> 21:43:42,322 I've got an H1 tag that says\nregistrants, big and bold. 22745 21:43:42,322 --> 21:43:44,691 Then I've got a table\nthat we saw last week. 22746 21:43:44,691 --> 21:43:48,561 This has a table head that just\nsays name sport for two columns. 22747 21:43:48,562 --> 21:43:53,961 Then it has a table body where in,\n 22748 21:43:53,961 --> 21:43:57,172 I'm saying, for each name\nin the registrants variable 22749 21:43:57,172 --> 21:44:01,162 output a table row, start tag,\nand end tag, inside of which 22750 21:44:01,161 --> 21:44:04,341 two table datas, two\ncells, table data for name 22751 21:44:04,342 --> 21:44:08,552 table data for registrants bracket name. 22752 21:44:08,551 --> 21:44:10,971 So it's very similar to Python syntax. 22753 21:44:10,971 --> 21:44:14,861 It essentially is Python syntax, albeit\n 22754 21:44:15,361 --> 21:44:17,511 So the net effect here is what? 22755 21:44:17,512 --> 21:44:20,812 Let me open up my terminal\nwindow, run Flask run. 22756 21:44:20,812 --> 21:44:24,872 Let me now go into the\nform that I premade here. 22757 21:44:26,032 --> 21:44:27,952 Let me go ahead and type in David. 22758 21:44:27,952 --> 21:44:30,322 Let me choose, oh, no sport. 22759 21:44:33,232 --> 21:44:34,881 And there is the grumpy cat. 22760 21:44:34,881 --> 21:44:37,881 So missing sport, though,\nspecifically was outputted. 22761 21:44:38,672 --> 21:44:41,422 Let me go ahead and say no name. 22762 21:44:44,751 --> 21:44:47,241 All right, and let me\nmaliciously, now, do this. 22763 21:44:49,581 --> 21:44:53,691 I'll type my name, sure, but let\n 22764 21:44:53,691 --> 21:44:57,891 Let me maliciously go down in ultimate\n 22765 21:44:59,542 --> 21:45:04,342 Change that and change\nthis to volleyball. 
22766 21:45:05,122 --> 21:45:08,991 So now, I can register for\nany sport I want to create. 22767 21:45:08,991 --> 21:45:11,961 Let me click register,\nbut invalid sports. 22768 21:45:11,961 --> 21:45:14,182 So again, that speaks to\nthe power and the need 22769 21:45:14,182 --> 21:45:17,361 for checking things on backend\nand not trusting users. 22770 21:45:17,361 --> 21:45:21,591 It is that easy to hack websites\n 22771 21:45:22,292 --> 21:45:24,292 All right, finally, let's\njust do this for real. 22772 21:45:24,292 --> 21:45:26,292 David is going to register\nfor ultimate frisbee. 22773 21:45:27,292 --> 21:45:30,562 And now, the output is not\nvery pretty, but notice 22774 21:45:30,562 --> 21:45:32,552 I'm at the registrants route. 22775 21:45:32,551 --> 21:45:34,641 And if I zoom out, I have an HTML table. 22776 21:45:34,642 --> 21:45:38,132 Two columns, name and sport,\nDavid and ultimate frisbee. 22777 21:45:38,131 --> 21:45:41,881 Let me go back to the form, letting me\n 22778 21:45:41,881 --> 21:45:43,322 and registered for basketball. 22779 21:45:44,221 --> 21:45:49,111 Now we see two rows in this table,\n 22780 21:45:49,622 --> 21:45:51,414 And if we do this one\nmore time, maybe Emma 22781 21:45:51,414 --> 21:45:53,762 comes along and registers\nfor soccer register. 22782 21:45:53,762 --> 21:45:58,667 All of this information is being\nstored in this dictionary, now. 22783 21:45:58,667 --> 21:45:59,792 All right, so that's great. 22784 21:45:59,792 --> 21:46:03,991 Now we have a database, albeit in\n 22785 21:46:03,991 --> 21:46:09,039 But why is this, maybe, not\nthe best implementation? 22786 21:46:10,716 --> 21:46:13,619 AUDIENCE: You are storing [INAUDIBLE]. 22787 21:46:18,851 --> 21:46:21,476 So we're only storing this\ndictionary in the computer's memory 22788 21:46:21,476 --> 21:46:24,732 and that's great until I hit\nControl C and kill Flask 22789 21:46:26,172 --> 21:46:29,412 Or the server reboots, or maybe\nI close my laptop or whatever. 
22790 21:46:29,411 --> 21:46:32,711 If the server stops running,\nmemory is going to be lost. 22791 21:46:34,182 --> 21:46:37,312 It's thrown away when you lose\npower or stop the program. 22792 21:46:37,312 --> 21:46:39,101 So maybe this isn't the best approach. 22793 21:46:39,101 --> 21:46:41,361 Maybe it would be better\nto use a CSV file. 22794 21:46:41,361 --> 21:46:43,991 And in fact, some 20 years ago,\nthat's literally what I did. 22795 21:46:43,991 --> 21:46:45,891 I stored everything in a CSV file. 22796 21:46:45,892 --> 21:46:48,402 But let's skip that step,\nbecause we already saw last week 22797 21:46:48,402 --> 21:46:51,551 or a couple of weeks ago\nnow, how we can use SQLite. 22798 21:46:51,551 --> 21:46:54,131 Let's see if we can't\nmarry in some SQL here 22799 21:46:54,131 --> 21:46:58,131 to store an actual\ndatabase for the program. 22800 21:46:58,131 --> 21:47:00,072 Let me go back here and\nlet me open up, say 22801 21:47:00,072 --> 21:47:03,611 version four of this, which\nis almost the same but it 22802 21:47:03,611 --> 21:47:05,291 adds a bit more functionality. 22803 21:47:05,292 --> 21:47:10,542 Let me close these tabs and let me\n 22804 21:47:10,542 --> 21:47:13,752 So notice it's almost\nthe same, but at the top 22805 21:47:13,751 --> 21:47:18,251 I'm creating a database connection\n 22806 21:47:18,251 --> 21:47:20,044 So that's a database\nI created in advance. 22807 21:47:20,044 --> 21:47:21,461 So let's go down that rabbit hole. 22808 21:47:22,482 --> 21:47:24,622 Let me make my terminal window bigger. 22809 21:47:24,622 --> 21:47:28,362 Let me run SQLite 3 of FroshIMS.db. 22810 21:47:30,342 --> 21:47:33,432 and let's just infer what\nI designed this to be. 
22811 21:47:33,432 --> 21:47:37,721 I have a table called registrants,\n 22812 21:47:37,721 --> 21:47:42,101 An ID column that's an integer, a name\n 22813 21:47:42,101 --> 21:47:44,711 and a sport column that's\nalso text, cannot be null 22814 21:47:44,711 --> 21:47:46,361 and the primary key is just ID. 22815 21:47:46,361 --> 21:47:49,391 So that I have a unique\nID for every registration. 22816 21:47:49,392 --> 21:47:51,912 Let's see if there's\nanyone in there yet. 22817 21:47:51,911 --> 21:47:55,604 Select star from registrants. 22818 21:47:55,604 --> 21:47:56,771 OK, there's no one in there. 22819 21:47:56,771 --> 21:47:58,361 No one is yet registered for sports. 22820 21:47:58,361 --> 21:48:00,941 So let's go back to the\ncode and continue on. 22821 21:48:00,941 --> 21:48:03,792 In my code now, I've got\nthe same global variable 22822 21:48:03,792 --> 21:48:07,032 for validation and\ngeneration of my HTML. 22823 21:48:07,032 --> 21:48:09,522 Looks like my index route is the same. 22824 21:48:09,521 --> 21:48:13,241 It's dynamically generating\nthe menu of sports. 22825 21:48:13,241 --> 21:48:14,899 Interestingly, we'll come back to this. 22826 21:48:14,899 --> 21:48:17,232 There's a deregister route\nthat's going to allow someone 22827 21:48:17,232 --> 21:48:22,092 to deregister themselves if\nthey want to exit the sport 22828 21:48:24,702 --> 21:48:27,342 Here's my new and\nimproved register route. 22829 21:48:27,342 --> 21:48:30,702 Still works on POST, so\nsome mild privacy there. 22830 21:48:30,702 --> 21:48:33,532 I'm validating the\nsubmission as follows. 22831 21:48:33,532 --> 21:48:36,942 I'm getting the user's inputted\nname, the user's inputted sport 22832 21:48:36,941 --> 21:48:41,141 and if it is not a name or\nthe sport is not in sports 22833 21:48:41,142 --> 21:48:42,959 I'm going to render failure.html. 22834 21:48:43,792 --> 21:48:45,142 There's no cat in this version. 
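The table described above can be reproduced with Python's built-in `sqlite3` module. This is a sketch of a schema that matches the description (an integer id as primary key, plus name and sport as non-null text); the exact statement used to create FroshIMS.db is not shown in the transcript, so the wording here is an assumption.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE registrants (
        id INTEGER,
        name TEXT NOT NULL,
        sport TEXT NOT NULL,
        PRIMARY KEY(id)
    )
    """
)

rows = conn.execute("SELECT * FROM registrants").fetchall()
print(rows)  # empty list: no one has registered yet
```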
22835 21:48:46,542 --> 21:48:50,262 Otherwise, recall how we\nco-mingled SQL and Python before. 22836 21:48:50,262 --> 21:48:53,562 We're using CS50's SQL\nlibrary, but that just 22837 21:48:53,562 --> 21:48:56,892 makes it a little easier to execute\n 22838 21:48:56,892 --> 21:49:00,522 Insert into registrants\nname comma sport. 22839 21:49:00,521 --> 21:49:05,152 What two values, the name and the\n 22840 21:49:05,152 --> 21:49:08,021 And then lastly, and this is a new\n 22841 21:49:08,021 --> 21:49:10,781 explicitly now, Flask\nalso gives you access 22842 21:49:10,782 --> 21:49:17,407 to a redirect function, which is how\n 22843 21:49:17,407 --> 21:49:19,782 and all these other sites we\nplayed around with last week 22844 21:49:19,782 --> 21:49:23,262 were all implemented redirecting\n 22845 21:49:23,262 --> 21:49:26,741 This Flask function\nredirect comes from my just 22846 21:49:26,741 --> 21:49:30,221 having imported it at the\nvery top of this file. 22847 21:49:30,221 --> 21:49:35,141 It handles the HTTP 301 or 302 or 307\n 22848 21:49:37,211 --> 21:49:42,282 All right, so that's it for\nregistering via this route. 22849 21:49:42,282 --> 21:49:45,912 Let's look at what the\nregistrants route is. 22850 21:49:45,911 --> 21:49:49,031 Here, we have a new\nroute for /registrants. 22851 21:49:49,032 --> 21:49:52,042 And instead of just iterating\nover a dictionary like before 22852 21:49:52,042 --> 21:49:56,471 we're getting back, let's\nsee, db.execute of select star 22853 21:49:57,312 --> 21:50:00,402 So that's literally the programmatic\n 22854 21:50:00,402 --> 21:50:02,411 That gives me back a\nlist of dictionaries 22855 21:50:02,411 --> 21:50:05,652 each of which represents\none row in the table. 22856 21:50:05,652 --> 21:50:08,831 Then, I'm going to render\nregistrants.html 22857 21:50:08,831 --> 21:50:12,281 passing in literally\nthat list of dictionaries 22858 21:50:12,282 --> 21:50:15,562 just like using CS50's\nlibrary in the past. 
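The insert-then-select flow can be sketched with the standard `sqlite3` module standing in for CS50's SQL library, which the lecture actually uses. The helper names `register` and `list_registrants` are hypothetical; setting `row_factory` is one way to get the list-of-dictionaries shape that `db.execute` is described as returning.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can then be converted to dicts
conn.execute(
    "CREATE TABLE registrants ("
    "id INTEGER, name TEXT NOT NULL, sport TEXT NOT NULL, PRIMARY KEY(id))"
)


def register(name, sport):
    # The ? placeholders keep user input out of the SQL text itself.
    conn.execute(
        "INSERT INTO registrants (name, sport) VALUES (?, ?)", (name, sport)
    )
    conn.commit()


def list_registrants():
    """Return every row as a dictionary, one per registration."""
    cursor = conn.execute("SELECT * FROM registrants")
    return [dict(row) for row in cursor.fetchall()]


register("David", "Ultimate Frisbee")
register("Carter", "Basketball")
print(list_registrants())
```

Each dictionary carries the `id` that SQLite assigned, which is what a template needs in order to build a per-row form.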
22859 21:50:15,562 --> 21:50:18,522 So let's go and look\nat these-- that form. 22860 21:50:18,521 --> 21:50:24,372 If I go into templates and\nopen up registrants.html 22861 21:50:24,372 --> 21:50:27,372 oh, OK, it's just a table like before. 22862 21:50:27,372 --> 21:50:30,672 And actually, let me change this\nsyntactically for consistency. 22863 21:50:30,672 --> 21:50:36,312 We have a Jinja for loop that\niterates over each registrant 22864 21:50:36,312 --> 21:50:39,672 and for each of them,\noutputs a table row. 22865 21:50:39,672 --> 21:50:40,902 Oh, but this is interesting. 22866 21:50:40,902 --> 21:50:44,622 Instead of just having two columns\n 22867 21:50:44,622 --> 21:50:47,654 notice that I'm also\noutputting a full-fledged form. 22868 21:50:47,653 --> 21:50:49,361 All right, this is\nstarting to get juicy. 22869 21:50:49,361 --> 21:50:52,661 So let's actually go back\nto my terminal window 22870 21:50:52,661 --> 21:50:56,531 run Flask, and actually see what\nthis example looks like now. 22871 21:50:58,812 --> 21:51:00,592 In the home page, it\nlooks exactly the same. 22872 21:51:00,592 --> 21:51:02,175 But let me now register for something. 22873 21:51:02,175 --> 21:51:04,792 David for ultimate frisbee, register. 22874 21:51:08,831 --> 21:51:12,561 David registering for\nultimate frisbee, register. 22875 21:51:13,062 --> 21:51:15,132 So good thing I have deregister. 22876 21:51:15,131 --> 21:51:16,932 So this is what it should now look like. 22877 21:51:16,932 --> 21:51:22,119 I have a page at the route called\n 22878 21:51:22,119 --> 21:51:24,161 columns, name and sport,\nDavid, ultimate frisbee. 22879 21:51:24,161 --> 21:51:25,521 But oh, wait, a third column. 22880 21:51:26,021 --> 21:51:30,461 Because if I view the page source,\n 22881 21:51:30,461 --> 21:51:34,572 For every row in this table, I'm also\n 22882 21:51:36,532 --> 21:51:40,092 But before we see how that works,\n 22883 21:51:40,682 --> 21:51:42,282 So Carter will give you basketball. 
22884 21:51:44,961 --> 21:51:48,072 Now, let me go back and let's\nregister Emma for soccer. 22885 21:51:50,051 --> 21:51:55,481 Before we look at that HTML, let's\n 22886 21:51:55,482 --> 21:51:59,741 Let's go into SQLite FroshIMS. 22887 21:51:59,741 --> 21:52:10,152 Let me go into FroshIMS, and let me\n 22888 21:52:10,152 --> 21:52:12,461 And now do select star from registrants. 22889 21:52:12,461 --> 21:52:16,391 And whereas, previously, when I executed\n 22890 21:52:17,471 --> 21:52:21,111 So now we see exactly what's\ngoing on underneath the hood. 22891 21:52:21,111 --> 21:52:24,461 So let's look at this\nform now-- this page now. 22892 21:52:24,461 --> 21:52:29,141 If I want to unregister, deregister\n 22893 21:52:30,941 --> 21:52:33,042 Clicking one of those\nbuttons will indeed 22894 21:52:33,042 --> 21:52:36,021 delete the row from the database. 22895 21:52:36,021 --> 21:52:41,411 But how do we go about linking a web\n 22896 21:52:41,411 --> 21:52:43,451 This is the last piece of the puzzle. 22897 21:52:43,452 --> 21:52:47,232 Up until now, everything's been\nwith forms and also with URLs. 22898 21:52:47,232 --> 21:52:49,362 But what if the user is\nnot typing anything in 22899 21:52:49,361 --> 21:52:51,551 they're just clicking a button? 22900 21:52:53,422 --> 21:52:55,897 Let me go ahead and\nsniff the traffic, which 22901 21:52:55,896 --> 21:52:57,521 you could be in the habit of doing now. 22902 21:52:57,521 --> 21:53:01,331 Any time you're curious how a website\n 22903 21:53:01,331 --> 21:53:05,771 And Carter, shall we\nderegister you from basketball? 22904 21:53:05,771 --> 21:53:09,311 Let's deregister Carter and\nlet's see what just happened. 22905 21:53:09,312 --> 21:53:13,392 If I look at the deregister\nrequest, notice that it's a POST. 22906 21:53:13,392 --> 21:53:16,212 The status code that\neventually came back as 302 22907 21:53:16,211 --> 21:53:18,641 but let's look at the request itself. 
22908 21:53:18,642 --> 21:53:21,372 All the headers there we'll ignore. 22909 21:53:21,372 --> 21:53:25,872 The only thing that\nbutton submits, cleverly 22910 21:53:25,872 --> 21:53:29,892 is an ID parameter, a key equaling two. 22911 21:53:29,892 --> 21:53:34,152 What does two presumably\nrepresent or map to? 22912 21:53:34,152 --> 21:53:37,062 Where did this two come from? 22913 21:53:37,062 --> 21:53:40,782 It doesn't say Carter, it\ndoesn't say basketball? 22914 21:53:41,330 --> 21:53:43,122 AUDIENCE: The second\nperson who registered. 22915 21:53:43,122 --> 21:53:44,532 DAVID: The second\nperson that registered. 22916 21:53:44,532 --> 21:53:47,682 So those primary keys that we started\n 22917 21:53:47,682 --> 21:53:50,982 ago, why it's useful to be able to\n 22918 21:53:50,982 --> 21:53:53,232 here is just one of the reasons why. 22919 21:53:53,232 --> 21:53:58,271 If it suffices for me just to\nsend the ID number of the person 22920 21:53:58,271 --> 21:54:03,141 I want to delete from the database,\n 22921 21:54:03,142 --> 21:54:09,322 If I go into app.py and I look at my\n 22922 21:54:11,172 --> 21:54:15,642 I first go into the form, and I get\n 22923 21:54:15,642 --> 21:54:19,872 If there was, in fact, an ID, and\nthe form wasn't somehow empty 22924 21:54:19,872 --> 21:54:21,642 I execute this line of code. 22925 21:54:21,642 --> 21:54:24,642 Delete from registrants where\nID equals question mark 22926 21:54:24,642 --> 21:54:29,202 and then I plug-in that number,\ndeleting Carter and only Carter. 22927 21:54:29,202 --> 21:54:32,472 And I'm not using his name, because\n 22928 21:54:32,471 --> 21:54:34,001 two people named Emma or David? 22929 21:54:34,001 --> 21:54:35,601 You don't want to delete both of them. 22930 21:54:35,601 --> 21:54:39,732 That's why these unique\nIDs are so, so important. 22931 21:54:39,732 --> 21:54:41,622 And here's another reason why. 22932 21:54:41,622 --> 21:54:45,012 You don't want to store\nsome things in URLs. 
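Deleting by primary key rather than by name can be sketched as follows, again with the standard `sqlite3` module in place of CS50's library. The function name `deregister`, the sample rows, and the digit check are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrants ("
    "id INTEGER, name TEXT NOT NULL, sport TEXT NOT NULL, PRIMARY KEY(id))"
)
conn.executemany(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    [("David", "Ultimate Frisbee"), ("Emma", "Basketball"), ("Emma", "Soccer")],
)


def deregister(form):
    """Delete exactly one registration, identified by its primary key."""
    registrant_id = form.get("id")
    if registrant_id and registrant_id.isdigit():
        conn.execute(
            "DELETE FROM registrants WHERE id = ?", (int(registrant_id),)
        )
        conn.commit()


deregister({"id": "2"})  # form values arrive as strings
print(conn.execute("SELECT id, name, sport FROM registrants").fetchall())
```

Two people named Emma are registered here, and only the row with id 2 disappears; a delete keyed on the name would have removed both.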
22933 21:54:45,012 --> 21:54:49,752 Suppose we went to this\nURL, deregister?ID=3. 22934 21:54:53,262 --> 21:54:58,131 Suppose I, maliciously,\nemailed this URL to Emma. 22935 21:54:58,131 --> 21:55:00,131 It doesn't matter so much\nwhat the beginning is 22936 21:55:00,131 --> 21:55:06,251 but suppose I emailed her this URL,\n 22937 21:55:07,691 --> 21:55:10,331 And it uses GET instead of POST. 22938 21:55:10,331 --> 21:55:12,326 What did I just trick her into doing? 22939 21:55:15,369 --> 21:55:17,161 What's going to happen\nif Emma clicks this? 22940 21:55:19,021 --> 21:55:21,521 DAVID: You would trick her\ninto deregistering herself. 22941 21:55:22,021 --> 21:55:24,811 Because if she's logged\ninto this FroshIMS website 22942 21:55:24,812 --> 21:55:28,862 and the URL contains her ID just\nbecause I'm being malicious 22943 21:55:28,861 --> 21:55:31,921 and she clicked on it and\nthe website is using GET 22944 21:55:31,922 --> 21:55:34,711 unfortunately, GET URLs\nare, again, stateful. 22945 21:55:34,711 --> 21:55:36,782 They have state information in the URLs. 22946 21:55:36,782 --> 21:55:39,532 And in this case, it's enough\nto delete the user and boom 22947 21:55:39,532 --> 21:55:42,144 she would have accidentally\nderegistered herself. 22948 21:55:42,144 --> 21:55:43,351 And this is pretty innocuous. 22949 21:55:43,351 --> 21:55:45,872 Suppose that this was\nher bank account trying 22950 21:55:45,872 --> 21:55:47,672 to make a withdrawal or a deposit. 22951 21:55:47,672 --> 21:55:50,822 Suppose that this were some\nother website, a Facebook URL 22952 21:55:50,822 --> 21:55:53,339 trying to trick her into\nposting something automatically. 
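One way to picture why a clicked link should not be able to delete anything is a handler that refuses to change state on GET. This is a simplified sketch, not Flask code: `handle_deregister`, the `REGISTRANTS` mapping, and the returned status codes are hypothetical stand-ins, and requiring POST narrows the trick described above without being a complete defense, since a form on another site can also submit a POST.

```python
REGISTRANTS = {3: ("Emma", "Soccer")}  # hypothetical id -> (name, sport)


def handle_deregister(method, params):
    """Return an HTTP-style status code for a deregister request."""
    if method != "POST":
        return 405  # Method Not Allowed: following a link cannot delete
    REGISTRANTS.pop(int(params["id"]), None)
    return 302  # redirect back to the list after a successful change


print(handle_deregister("GET", {"id": "3"}), REGISTRANTS)
print(handle_deregister("POST", {"id": "3"}), REGISTRANTS)
```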
22953 21:55:53,339 --> 21:55:55,172 Here, too, is another\nconsideration when you 22954 21:55:55,172 --> 21:55:59,402 should use POST versus GET,\nbecause GET requests can 22955 21:55:59,402 --> 21:56:04,331 be plugged into emails sent via Slack\n 22956 21:56:04,331 --> 21:56:06,601 And unless there's a\nprompt saying, are you sure 22957 21:56:06,601 --> 21:56:09,751 you want to deregister\nyourself, you might blindly 22958 21:56:09,751 --> 21:56:11,851 trick the user into being\nvulnerable to what's 22959 21:56:11,851 --> 21:56:14,221 called a cross-site request forgery. 22960 21:56:14,221 --> 21:56:17,401 A fancy way of saying you\ntrick them into clicking a link 22961 21:56:17,402 --> 21:56:21,241 that they shouldn't have, because\n 22962 21:56:21,241 --> 21:56:25,111 All right, any question, then,\non these building blocks? 22963 21:56:26,929 --> 21:56:32,275 AUDIENCE: What do the first\nthing in the instance of the SQL 22964 21:56:32,275 --> 21:56:34,232 [INAUDIBLE] where they\nhave three slashes? 22965 21:56:35,501 --> 21:56:37,456 DAVID: When three columns, you mean? 22966 21:56:37,456 --> 21:56:42,311 AUDIENCE: No, three forward slashes. 22967 21:56:42,312 --> 21:56:44,882 DAVID: The three forward slashes. 22968 21:56:46,407 --> 21:56:49,377 AUDIENCE: Yeah, so I\nthink it's in [INAUDIBLE].. 22969 21:56:52,851 --> 21:56:56,031 DAVID: Sorry, it's in where? 22970 21:56:57,235 --> 21:57:02,065 AUDIENCE: It's in [INAUDIBLE] scroll up. 22971 21:57:06,304 --> 21:57:07,721 DAVID: Sorry, the other direction? 22972 21:57:12,846 --> 21:57:16,407 So please scroll a little bit more. 22973 21:57:16,407 --> 21:57:17,532 DAVID: Keep scrolling more? 22974 21:57:20,081 --> 21:57:27,761 This is a URI, it's typical syntax\n 22975 21:57:27,762 --> 21:57:32,351 protocol, so to speak, which means\n 22976 21:57:32,351 --> 21:57:35,322 :// is just like you and I see in URLs. 22977 21:57:35,322 --> 21:57:37,961 The third slash, essentially,\nmeans current folder. 
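The three slashes can be taken apart with the standard URL parser. The lowercase file name below is an assumption for illustration; the point is only how the pieces of such a URI are split.

```python
from urllib.parse import urlparse

# File name assumed for illustration.
uri = urlparse("sqlite:///froshims.db")

print(uri.scheme)        # the protocol part before ://
print(repr(uri.netloc))  # empty: no host appears between // and the third /
print(uri.path)          # the third slash begins the path to the local file
```

The first two slashes belong to the `://` separator, and the host that would normally follow them is simply empty because the file is local.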
22978 21:57:38,622 --> 21:57:41,461 So it's a weird curiosity,\nbut it's typical 22979 21:57:41,461 --> 21:57:43,961 whenever you're referring to a\nlocal file and not one that's 22980 21:57:45,251 --> 21:57:48,311 That's a bit of an oversimplification,\n 22981 21:57:48,312 --> 21:57:50,472 Sorry for not clicking earlier. 22982 21:57:50,471 --> 21:57:53,292 All right, let's do one\nother iteration of FroshIMS 22983 21:57:53,292 --> 21:57:56,622 here just to show what I was\n 22984 21:57:56,622 --> 21:57:59,562 was not only storing these\nthings in CSV files, as I recall. 22985 21:57:59,562 --> 21:58:02,047 I was also automatically\ngenerating an email 22986 21:58:02,047 --> 21:58:04,422 to the proctor in charge of\nthe intramural sports program 22987 21:58:04,422 --> 21:58:07,512 so that they would have sort of a\n 22988 21:58:07,512 --> 21:58:09,652 and they could easily\nreply to them as well. 22989 21:58:09,652 --> 21:58:13,282 Let me go into FroshIMS version\nfive, which I precreated here 22990 21:58:13,282 --> 21:58:18,642 and let me go ahead and open\nup, say, app.py this time. 22991 21:58:18,642 --> 21:58:21,132 And this is some code\nthat I wrote in advance. 22992 21:58:21,131 --> 21:58:24,921 And it looks a little scary at first\n 22993 21:58:24,922 --> 21:58:30,312 I have now added the Flask\nmail library to the picture 22994 21:58:30,312 --> 21:58:33,461 by adding Flask mail to\nrequirements.txt and running a command 22995 21:58:33,461 --> 21:58:37,101 to automatically install email\nsupport for Flask as well. 22996 21:58:37,101 --> 21:58:39,851 And this is a little bit\ncryptic, but it's honestly mostly 22997 21:58:39,851 --> 21:58:41,622 copy paste from the documentation. 22998 21:58:41,622 --> 21:58:44,351 What I'm doing here is\nI'm configuring my Flask 22999 21:58:44,351 --> 21:58:47,721 application with a few configuration\nvariables, if you will. 
23000 21:58:47,721 --> 21:58:51,221 This is the syntax for that.\n 23001 21:58:51,221 --> 21:58:55,601 comes with Flask that is automatically\n 23002 21:58:55,601 --> 21:58:58,961 on line nine, and I just had to fill\n 23003 21:58:58,961 --> 21:59:01,482 values for the default\nsender address that I 23004 21:59:01,482 --> 21:59:05,472 want to send email as, the default\n 23005 21:59:05,471 --> 21:59:08,651 the port number, the TCP port,\nthat we talked about last week. 23006 21:59:08,652 --> 21:59:12,491 The mail server, I'm going to use\nGmail's smtp.gmail.com server. 23007 21:59:12,491 --> 21:59:14,531 Use TLS, this means use encryption. 23008 21:59:15,672 --> 21:59:18,562 Mail username, this is going\nto grab it from my environment. 23009 21:59:18,562 --> 21:59:21,972 So for security purposes, I didn't\n 23010 21:59:21,971 --> 21:59:24,023 and password into the code. 23011 21:59:24,024 --> 21:59:26,982 So I'm actually storing those in what\n 23012 21:59:26,982 --> 21:59:29,112 You'll see more of these\nin problem set nine 23013 21:59:29,111 --> 21:59:31,121 and it's a very common\nconvention on a server 23014 21:59:31,122 --> 21:59:36,771 in the real world to store sensitive\n 23015 21:59:36,771 --> 21:59:39,641 so that it can be accessed\nwhen your website is running 23016 21:59:39,642 --> 21:59:41,202 but not in your source code. 23017 21:59:41,202 --> 21:59:44,232 It's way too easy if\nyou put credentials 23018 21:59:44,232 --> 21:59:47,532 sensitive stuff in your source\ncode, to post it to GitHub 23019 21:59:47,532 --> 21:59:50,832 or to screenshot it accidentally,\n 23020 21:59:50,831 --> 21:59:55,961 So for today's purposes, know that the\n 23021 21:59:55,961 --> 21:59:57,581 are called environment variables. 23022 21:59:57,581 --> 22:00:01,091 And this is like an out-of-band, a\n 23023 22:00:01,092 --> 22:00:04,122 pairs in the computer's memory\nby running a certain command 23024 22:00:04,122 --> 22:00:06,491 but that never show up\nin your actual code. 
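Reading credentials from the environment rather than from source code might look like the sketch below. The configuration key names follow Flask-Mail's conventions; the function `load_mail_config` is hypothetical, and the port number 587 is an assumption, since the transcript does not state the value used.

```python
import os


def load_mail_config(env=os.environ):
    """Read mail settings, taking credentials from the environment."""
    return {
        "MAIL_DEFAULT_SENDER": env.get("MAIL_DEFAULT_SENDER"),
        "MAIL_PASSWORD": env.get("MAIL_PASSWORD"),  # never hardcoded
        "MAIL_PORT": 587,  # assumed port for TLS submission
        "MAIL_SERVER": "smtp.gmail.com",
        "MAIL_USE_TLS": True,
        "MAIL_USERNAME": env.get("MAIL_USERNAME"),  # never hardcoded
    }


config = load_mail_config()
print(config["MAIL_SERVER"], config["MAIL_PORT"])
```

If a variable has not been set in the shell that starts the server, `env.get` returns `None`, so the secret is absent from both the code and any screenshot of it.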
23025 22:00:06,491 --> 22:00:09,131 Otherwise, there would be so\nmany usernames and passwords 23026 22:00:09,131 --> 22:00:11,921 accidentally visible on the internet. 23027 22:00:11,922 --> 22:00:14,322 So I've installed this in advance. 23028 22:00:14,322 --> 22:00:16,152 Let me see if I can do this correctly. 23029 22:00:16,152 --> 22:00:19,581 Let me go over to another\ntab in just a moment. 23030 22:00:19,581 --> 22:00:23,441 And here, I have on my second\nscreen here, John Harvard's inbox. 23031 22:00:23,441 --> 22:00:26,322 It's currently empty, and I'm\ngoing to go ahead and register 23032 22:00:26,322 --> 22:00:28,542 for some sport as John\nHarvard here, hopefully. 23033 22:00:28,542 --> 22:00:32,831 So let me go ahead and run\nFlask run on this version five. 23034 22:00:32,831 --> 22:00:36,311 Let me go ahead and\nreload the main screen. 23035 22:00:37,001 --> 22:00:38,681 Let me reload the main screen here. 23036 22:00:38,682 --> 22:00:41,411 This time, clearly, I'm\nasking for name and email. 23037 22:00:41,411 --> 22:00:43,497 So name will be John Harvard. 23038 22:00:45,771 --> 22:00:51,342 He'll register for, how about soccer. 23039 22:00:52,902 --> 22:00:57,612 And if I did this correctly, not\n 23040 22:00:57,611 --> 22:01:06,701 seeing you are registered, but when he\n 23041 22:01:06,702 --> 22:01:13,002 crossing his fingers that this\n 23042 22:01:13,001 --> 22:01:15,086 and I promise it did right before class. 23043 22:01:22,851 --> 22:01:25,241 I don't think there's\na mistake this time. 23044 22:01:28,402 --> 22:01:31,822 Let me try something\nover here real quick 23045 22:01:31,822 --> 22:01:35,661 but I don't think this is broken. 23046 22:01:35,661 --> 22:01:39,171 It wouldn't have said\nsuccess if it were. 23047 22:01:39,172 --> 22:01:44,992 I just tried submitting again, so I\n 23048 22:01:44,991 --> 22:01:47,156 Oh, I'm really sad right now. 
23049 22:01:53,682 --> 22:01:57,521 DAVID: I could check\nspam, but then it's-- 23050 22:01:57,521 --> 22:02:01,991 not sure we want to show spam here on\n 23051 22:02:13,751 --> 22:02:15,521 Wow, that was a risky click I worried. 23052 22:02:15,521 --> 22:02:19,250 All right, so you are registered\nis the email that I sent out 23053 22:02:19,250 --> 22:02:21,292 and it doesn't have any\nactual information in it. 23054 22:02:21,292 --> 22:02:23,262 But back in the day it\nwould have, because I 23055 22:02:23,262 --> 22:02:25,092 included the student's\nname and their dorm 23056 22:02:25,092 --> 22:02:27,634 and all of the other fields of\ninformation that we asked for. 23057 22:02:27,634 --> 22:02:30,312 So let's just take a quick look\nat how that code might work. 23058 22:02:30,312 --> 22:02:33,522 I did have to configure Gmail\nin a certain way to allow 23059 22:02:33,521 --> 22:02:36,191 what they call, less\nsecure apps using SMTP 23060 22:02:36,191 --> 22:02:38,591 which is the protocol\nused for outbound email. 23061 22:02:38,592 --> 22:02:42,960 But besides setting these things, let's\n 23062 22:02:42,960 --> 22:02:44,502 It's actually pretty straightforward. 23063 22:02:44,501 --> 22:02:47,631 In my register route, I validated\n 23064 22:02:48,861 --> 22:02:52,761 I then confirmed the registration\ndown here, nothing new there. 23065 22:02:52,762 --> 22:02:55,482 All I did was use two new lines of code. 23066 22:02:55,482 --> 22:02:57,983 And it's this easy to automate\nthe sending of emails. 23067 22:02:57,983 --> 22:02:59,691 I apparently have done\nit too many times 23068 22:02:59,691 --> 22:03:01,601 which is why it ended up in spam. 23069 22:03:01,601 --> 22:03:03,581 I created a variable called message. 23070 22:03:03,581 --> 22:03:06,981 I used a message function that\nI must have imported higher up 23071 22:03:08,092 --> 22:03:10,872 Here's, apparently, the subject\nline as the first argument. 
23072 22:03:10,872 --> 22:03:15,192 And the second argument is the\nnamed parameter recipients 23073 22:03:15,191 --> 22:03:18,682 which takes a list of emails that\n 23074 22:03:18,682 --> 22:03:21,042 So in brackets, I just\nput the one user's email 23075 22:03:21,042 --> 22:03:24,081 and then mail.send that message. 23076 22:03:24,081 --> 22:03:29,021 So let's scroll back up to see what\n 23077 22:03:30,562 --> 22:03:33,792 Yep, mail is this, which\nI have as a variable 23078 22:03:33,792 --> 22:03:36,221 because I followed the\ndocumentation for this library. 23079 22:03:36,221 --> 22:03:41,809 You simply configure your current app\n 23080 22:03:41,809 --> 22:03:43,601 And if you look up here\nnow, on line seven 23081 22:03:43,601 --> 22:03:46,751 here's the new library\nfrom Flask mail I imported. 23082 22:03:46,751 --> 22:03:50,951 Capital Mail, capital Message, so that\n 23083 22:03:52,452 --> 22:03:55,302 So such a simple thing whether you\n 23084 22:03:55,301 --> 22:03:56,831 you want to do password resets. 23085 22:03:56,831 --> 22:04:00,701 It can be this easy to\nactually generate emails 23086 22:04:00,702 --> 22:04:03,754 provided you have the requisite\naccess and software installed. 23087 22:04:03,754 --> 22:04:05,961 And just to make clear that\nI did add something here 23088 22:04:05,961 --> 22:04:09,101 let me open up my requirements.txt\nfile, and indeed, I 23089 22:04:09,101 --> 22:04:13,812 have both Flask and\nFlask-mail ready to go. 23090 22:04:13,812 --> 22:04:16,812 But I ran the command in\nadvance to actually do that. 23091 22:04:16,812 --> 22:04:22,082 All right, any questions,\nthen, on these examples here? 
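The two lines described above use Flask-Mail's `Message` and `mail.send`. As a rough standard-library analogue (not the lecture's code), the sketch below builds the same kind of message with `email.message.EmailMessage` and shows how it could be handed to an SMTP server with `smtplib`. The function names, the sender address, and the message body are assumptions for illustration.

```python
import smtplib
from email.message import EmailMessage


def build_confirmation(recipient, sender="noreply@example.com"):
    """Create a 'You are registered!' email for a single recipient."""
    message = EmailMessage()
    message["Subject"] = "You are registered!"  # subject line, as in the demo
    message["From"] = sender                    # hypothetical sender address
    message["To"] = recipient                   # the one user's email
    message.set_content("You are registered!")  # assumption: placeholder body
    return message


def send(message, server, port, username, password):
    """Deliver the message over SMTP with TLS (not called in this sketch)."""
    with smtplib.SMTP(server, port) as smtp:
        smtp.starttls()                 # switch the connection to encryption
        smtp.login(username, password)  # credentials come from the environment
        smtp.send_message(message)


message = build_confirmation("jharvard@example.com")
```

Keeping message construction separate from delivery makes the first half easy to check without a mail server.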
23092 22:04:23,441 --> 22:04:29,572 So what other pieces might actually\n 23093 22:04:29,572 --> 22:04:32,675 It turns out that a key\ncomponent of most any web 23094 22:04:32,675 --> 22:04:34,842 application nowadays that\nwe haven't touched on yet 23095 22:04:34,842 --> 22:04:38,396 but it'll be one of our final flourishes\n 23096 22:04:38,396 --> 22:04:41,201 And a session is actually\na feature that derives 23097 22:04:41,202 --> 22:04:43,842 from all of the basics we talked\nabout today and last week 23098 22:04:43,842 --> 22:04:47,351 and a session is the technical term for\n 23099 22:04:47,351 --> 22:04:50,922 When you go to amazon.com and you start\n 23100 22:04:50,922 --> 22:04:53,605 they follow you from\npage to page to page. 23101 22:04:53,604 --> 22:04:56,021 Heck if you close your browser,\ncome back to the next day 23102 22:04:56,021 --> 22:04:59,668 they're typically still your shopping\n 23103 22:04:59,668 --> 22:05:01,001 because they want your business. 23104 22:05:01,001 --> 22:05:03,711 They don't want you to have to\nstart from scratch the next day. 23105 22:05:03,711 --> 22:05:07,482 Similarly, when you log\ninto any website these days 23106 22:05:07,482 --> 22:05:11,292 even if it's not an e-commerce thing\n 23107 22:05:11,292 --> 22:05:13,002 you and I are not in\nthe habit of logging 23108 22:05:13,001 --> 22:05:15,191 into every darn page\nwe visit on a website. 23109 22:05:15,191 --> 22:05:19,182 Typically, you log in once, and then\n 23110 22:05:19,182 --> 22:05:21,161 you stay logged into that website. 23111 22:05:21,161 --> 22:05:25,301 So somehow, the website is\nremembering that you have logged in. 23112 22:05:25,301 --> 22:05:27,889 And that is being implemented\nby way of this thing called 23113 22:05:27,889 --> 22:05:29,682 a session, and perhaps\na more familiar term 23114 22:05:29,682 --> 22:05:32,789 that you might know as, and\nworry about, called cookies. 
23115 22:05:32,789 --> 22:05:35,122 Let's go ahead and take one\nmore five minute break here. 23116 22:05:35,122 --> 22:05:37,414 And when we come back, we'll\nlook at cookies, sessions 23117 22:05:40,952 --> 22:05:43,961 So the promise now is that\nwe're going to implement 23118 22:05:43,961 --> 22:05:47,381 this notion of a session, which is going\n 23119 22:05:47,381 --> 22:05:50,262 them logged in and even implement\nthings like a shopping cart. 23120 22:05:50,262 --> 22:05:54,642 And the overarching goal here\nis to build an application that 23121 22:05:54,642 --> 22:05:56,142 is, quote unquote, "stateful. 23122 22:05:56,142 --> 22:05:59,322 Again, state refers to information,\n 23123 22:06:01,001 --> 22:06:05,711 And in this context, the curiosity\nis that HTTP is technically 23124 22:06:07,271 --> 22:06:11,381 Once you visit a URL,\nhttp://something, hit Enter 23125 22:06:11,381 --> 22:06:14,471 web page is downloaded to\nyour browser, like that's it. 23126 22:06:14,471 --> 22:06:17,471 You can unplug from the internet,\nyou can turn off your Wi-Fi 23127 22:06:17,471 --> 22:06:19,991 but you still have the web page locally. 23128 22:06:19,991 --> 22:06:23,171 And yet we somehow want to make\nsure that the next time you 23129 22:06:23,172 --> 22:06:25,822 click on a link on that website,\nit doesn't forget who you are. 23130 22:06:25,822 --> 22:06:27,822 Or the next thing you add\nto your shopping cart 23131 22:06:27,822 --> 22:06:29,631 it doesn't forget what\nwas already there. 23132 22:06:29,631 --> 22:06:32,861 So we somehow want to\nmake HTTP stateful 23133 22:06:32,861 --> 22:06:36,191 and we can actually do this using the\n 23134 22:06:36,191 --> 22:06:40,902 So concretely, here's a form you might\n 23135 22:06:42,491 --> 22:06:46,421 And I say rarely because most of\n 23136 22:06:46,422 --> 22:06:49,452 you just stay logged in, pretty\nmuch endlessly, in your browser. 
23137 22:06:49,452 --> 22:06:52,152 And that's because Google\nhas made the conscious choice 23138 22:06:52,152 --> 22:06:55,862 to give you a very long session\ntime, maybe a day, a week 23139 22:06:55,861 --> 22:06:57,611 a month, a year, because\nthey don't really 23140 22:06:57,611 --> 22:07:01,271 want to add friction to using their tool\n 23141 22:07:01,271 --> 22:07:04,191 By contrast, there are other\napplications on campus 23142 22:07:04,191 --> 22:07:07,031 including some of CS50's own,\n 23143 22:07:07,032 --> 22:07:09,102 Because we want to make\nsure that it's indeed you 23144 22:07:09,101 --> 22:07:13,042 accessing the site, and not a roommate\n 23145 22:07:13,042 --> 22:07:16,822 So once you do fill out this\nform, how does Google subsequently 23146 22:07:16,822 --> 22:07:20,152 know that you are you, and\nwhen you reload the page even 23147 22:07:20,152 --> 22:07:22,911 or open a second tab for\nyour same Gmail account 23148 22:07:22,911 --> 22:07:26,668 how do they know that you're still\n 23149 22:07:26,668 --> 22:07:29,001 Well, let's look underneath\nthe hood of what's going on. 23150 22:07:29,001 --> 22:07:32,451 When you log into Gmail,\nessentially, you initially 23151 22:07:32,452 --> 22:07:35,182 see a form like this\nusing a GET request. 23152 22:07:35,182 --> 22:07:37,851 And the website responds\nlike we saw last week 23153 22:07:37,851 --> 22:07:39,271 with some kind of HTTP response. 23154 22:07:39,271 --> 22:07:41,391 Hopefully 200 OK with the form. 23155 22:07:41,392 --> 22:07:46,222 Meanwhile, the website might\nalso respond with an HTTP header 23156 22:07:46,221 --> 22:07:49,432 that, last week, we didn't care\nabout, this week, we now do. 23157 22:07:49,432 --> 22:07:53,002 Whenever you visit a website,\nit is very commonly the case 23158 22:07:53,001 --> 22:07:56,072 that the website is putting\na cookie on your computer.
23159 22:07:56,072 --> 22:07:58,672 And you may generally know\nthat cookies can be bad 23160 22:07:58,672 --> 22:08:03,412 and they track you in some way, and\n 23161 22:08:03,411 --> 22:08:08,811 Without cookies, you could not implement\n 23162 22:08:10,342 --> 22:08:12,741 Unfortunately, they can also\nbe used for ill purposes 23163 22:08:12,741 --> 22:08:16,042 like tracking you on every website and\n 23164 22:08:16,542 --> 22:08:18,471 So with good comes some bad. 23165 22:08:18,471 --> 22:08:21,651 But the basic primitive for\nus, the computer scientist 23166 22:08:21,652 --> 22:08:24,622 boils down to just HTTP headers. 23167 22:08:24,622 --> 22:08:29,512 A cookie is typically a big number,\n 23168 22:08:29,512 --> 22:08:33,741 that a server tells your browser\nto store in memory, or even 23169 22:08:35,402 --> 22:08:38,941 So you can think of it like a file that\n 23170 22:08:38,941 --> 22:08:43,012 And the promise that HTTP\nmakes is that if a server sets 23171 22:08:43,012 --> 22:08:45,982 a cookie on your computer,\nyou will represent 23172 22:08:45,982 --> 22:08:49,982 that same cookie or that same\nvalue on every subsequent request. 23173 22:08:49,982 --> 22:08:51,862 So when you visit the\nwebsite like Gmail 23174 22:08:51,861 --> 22:08:55,611 they plop a cookie on your computer\nlike this with some session 23175 22:08:55,611 --> 22:08:57,741 equals value, some long random value. 23176 22:08:57,741 --> 22:09:00,072 One, two, three, A, B,\nC, something like that. 23177 22:09:00,072 --> 22:09:04,491 And when you then visit another page\n 23178 22:09:04,491 --> 22:09:09,351 you send the opposite header, not\n 23179 22:09:09,351 --> 22:09:11,372 and you send the exact same value. 
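The exchange described above can be sketched with Python's standard `http.cookies` module: the server sends a `Set-Cookie` response header containing a long random value, and the browser sends the same value back in a `Cookie` request header on later requests. The helper names and the cookie name `session` are assumptions for illustration.

```python
import secrets
from http.cookies import SimpleCookie


def make_set_cookie_value(session_id):
    """Server side: build the value of a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie["session"] = session_id
    return cookie["session"].OutputString()


def read_cookie_header(header_value):
    """Server side: extract the session value from a Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return cookie["session"].value if "session" in cookie else None


# First response: the server picks a long random identifier.
session_id = secrets.token_hex(16)  # 32 hexadecimal characters
set_cookie = make_set_cookie_value(session_id)

# Later request: the browser presents the same value back.
cookie_header = f"session={session_id}"
returned_id = read_cookie_header(cookie_header)
```

Because the identifier is generated with `secrets`, guessing another user's value is impractical, which is the point of the hand-stamp analogy that follows.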
23180 22:09:11,372 --> 22:09:13,779 It's similar to going to a\nclub or an amusement park 23181 22:09:13,778 --> 22:09:15,861 where you pay once, you\ngo through the gates once 23182 22:09:15,861 --> 22:09:17,781 you get checked by\nsecurity once, and then 23183 22:09:17,782 --> 22:09:22,672 they very often take like a little stamp\n 23184 22:09:22,672 --> 22:09:25,505 And then for you, efficiency-wise,\n 23185 22:09:25,505 --> 22:09:27,839 or later in the evening, you\ncan just present your hand. 23186 22:09:27,839 --> 22:09:29,452 You've been stamped, presumably. 23187 22:09:29,452 --> 22:09:32,122 They've already-- you've\nalready paid, you've 23188 22:09:32,122 --> 22:09:33,661 already been searched or whatnot. 23189 22:09:33,661 --> 22:09:35,631 And so it's this sort\nof fast track ticket 23190 22:09:35,631 --> 22:09:37,461 back into the club, back into the park. 23191 22:09:37,461 --> 22:09:41,241 That's essentially what a\ncookie is doing for you, whereby 23192 22:09:41,241 --> 22:09:43,822 it's a way of reminding the\nwebsite we've already done this 23193 22:09:43,822 --> 22:09:46,051 you already asked me for\nmy username and password. 23194 22:09:46,051 --> 22:09:48,531 This is my path to now come and go. 23195 22:09:48,532 --> 22:09:52,522 Now, unlike this hand stamp, which\n 23196 22:09:52,521 --> 22:09:55,131 or duplicated or kept\non over multiple days 23197 22:09:55,131 --> 22:09:59,941 these cookies are really big, seemingly\n 23198 22:09:59,941 --> 22:10:02,661 So statistically, there's\nno way someone else 23199 22:10:02,661 --> 22:10:05,872 is just going to guess your cookie\n 23200 22:10:05,872 --> 22:10:08,872 very low probability, statistically. 23201 22:10:08,872 --> 22:10:12,711 But this is all it boils down to is this\n 23202 22:10:12,711 --> 22:10:16,771 to send these values back\nand forth in this way. 
23203 22:10:16,771 --> 22:10:19,131 So when we actually\ntranslate this, now, to code 23204 22:10:19,131 --> 22:10:21,232 let's do something like\na simple login app. 23205 22:10:21,232 --> 22:10:24,351 Let me go into a folder I made\nin advance today called login. 23206 22:10:24,351 --> 22:10:28,922 And let me code up app.py and\nlet's take a look in here. 23207 22:10:30,471 --> 22:10:31,911 A couple of new things up top. 23208 22:10:31,911 --> 22:10:36,801 If I want to have the ability to\n 23209 22:10:36,801 --> 22:10:40,611 and implement sessions, I'm going\n 23210 22:10:41,551 --> 22:10:44,572 So this is another feature you\nget for free by using a framework 23211 22:10:44,572 --> 22:10:46,801 and not having to implement\nall this yourself. 23212 22:10:46,801 --> 22:10:48,951 And from the Flask\nsession library, I'm going 23213 22:10:48,952 --> 22:10:51,502 to import Session, capital S. Why? 23214 22:10:51,501 --> 22:10:53,751 I'm going to configure\nthe session as follows. 23215 22:10:53,751 --> 22:10:56,751 Long story short, there's different\nways to implement sessions. 23216 22:10:56,751 --> 22:11:00,711 The server can store these\ncookies in a database, in a file 23217 22:11:00,711 --> 22:11:03,152 in memory, in RAM, in other places too. 23218 22:11:03,152 --> 22:11:07,982 We are telling it to store these\n 23219 22:11:07,982 --> 22:11:11,452 So in fact, whenever you use sessions\n 23220 22:11:11,452 --> 22:11:15,292 you'll actually see a folder\n 23221 22:11:15,292 --> 22:11:17,842 inside of which are the\ncookies, essentially 23222 22:11:17,842 --> 22:11:20,211 for any users or friends\nor yourself who've been 23223 22:11:20,211 --> 22:11:22,801 visiting your particular application. 23224 22:11:22,801 --> 22:11:24,561 So I'm setting it to\nuse the file system 23225 22:11:24,562 --> 22:11:26,770 and I don't want them to be\npermanent because I want 23226 22:11:26,770 --> 22:11:29,482 when you close your browser,\nthe session to go away. 
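To make the file-system option above concrete, here is a small standard-library model of server-side sessions: one file per session, named after the cookie value. This is only a sketch of the idea. Flask-Session's real on-disk format is different, and the class name `FileSessionStore` and its methods are hypothetical.

```python
import json
import secrets
import tempfile
from pathlib import Path


class FileSessionStore:
    """Keep one JSON file per session, named after the cookie value."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, session_id):
        # Accept only identifiers made of hex digits, so a forged cookie
        # cannot point at a file outside the session directory.
        if not session_id or any(c not in "0123456789abcdef" for c in session_id):
            raise ValueError("invalid session id")
        return self.directory / f"{session_id}.json"

    def create(self):
        """Start a new, empty session and return its identifier."""
        session_id = secrets.token_hex(16)
        self.save(session_id, {})
        return session_id

    def load(self, session_id):
        """Return the stored data, or an empty dict for an unknown id."""
        path = self._path(session_id)
        return json.loads(path.read_text()) if path.exists() else {}

    def save(self, session_id, data):
        """Write the session's data to its own file."""
        self._path(session_id).write_text(json.dumps(data))


store = FileSessionStore(tempfile.mkdtemp())
alice = store.create()  # two visitors receive two different cookie values
bob = store.create()
store.save(alice, {"name": "Alice"})
```

Each visitor's data lives in a separate file, so loading by cookie value never mixes one user's data with another's.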
23227 22:11:29,482 --> 22:11:32,032 They could be made to be\npermanent and last much longer. 23228 22:11:32,032 --> 22:11:34,222 Then I tell my app to support sessions. 23229 22:11:35,902 --> 22:11:39,572 Let's see what this application actually\n 23230 22:11:39,572 --> 22:11:44,902 Let me go over to my terminal window,\n 23231 22:11:49,202 --> 22:11:51,502 Give it a second to kick back in. 23232 22:11:51,501 --> 22:11:53,361 Let me go ahead and open my URL. 23233 22:11:59,792 --> 22:12:02,226 So this website simply has a login form. 23234 22:12:02,226 --> 22:12:04,101 There's no password,\nthough I could certainly 23235 22:12:04,101 --> 22:12:05,902 add that and check for that too. 23236 22:12:07,411 --> 22:12:10,072 So I'm going to log in as\nmyself, David, and click Login. 23237 22:12:10,072 --> 22:12:12,892 And now notice I'm currently\nat the /login route. 23238 22:12:13,881 --> 22:12:16,206 If I try to go to the\ndefault route, just 23239 22:12:16,206 --> 22:12:18,831 slash, which is where most\nwebsites live by default 23240 22:12:18,831 --> 22:12:21,771 notice that I magically\nget redirected to log in. 23241 22:12:21,771 --> 22:12:24,921 So somehow, my code knows, hey, if\n 23242 22:12:26,691 --> 22:12:29,301 Let me type in my name,\nDavid, and click Login. 23243 22:12:29,301 --> 22:12:32,106 And now notice I am back at slash. 23244 22:12:32,107 --> 22:12:35,332 Chrome is sort of annoyingly hiding\n 23245 22:12:36,482 --> 22:12:39,232 And now notice it says you\nare logged in as David. 23246 22:12:39,892 --> 22:12:43,162 What's cool is notice if I reload\nthe page, it still knows that. 23247 22:12:43,161 --> 22:12:46,581 If I create a second tab and go to\n 23248 22:12:48,293 --> 22:12:50,001 I could keep doing\nthis in multiple tabs 23249 22:12:50,001 --> 22:12:53,721 it's still going to remember me on both\n 23250 22:12:55,051 --> 22:13:00,141 Especially when I click Log Out,\n 23251 22:13:00,142 --> 22:13:01,792 All right, so let's see how this works. 
23252 22:13:01,792 --> 22:13:03,741 And it's some basic building blocks. 23253 22:13:03,741 --> 22:13:07,551 Under my / route, notice I have this. 23254 22:13:07,551 --> 22:13:12,421 If there is no name in the session,\nredirect the user to /login. 23255 22:13:12,422 --> 22:13:14,992 So these two lines\ntogether are what implement 23256 22:13:14,991 --> 22:13:19,611 that automatic redirection using\nHTTP 301 or 302 automatically. 23257 22:13:19,611 --> 22:13:21,691 It's handled for me\nwith these two lines. 23258 22:13:23,351 --> 22:13:25,101 All right, let's go\ndown that rabbit hole. 23259 22:13:29,991 --> 22:13:36,652 Let me look in my templates\nfolder for my login demo and look 23260 22:13:40,042 --> 22:13:41,631 All right, so what's going on here? 23261 22:13:41,631 --> 22:13:46,221 I extend layout.html,\nI have a block body 23262 22:13:46,221 --> 22:13:47,961 and then I've got some other syntax. 23263 22:13:47,961 --> 22:13:50,504 So we haven't seen this yet,\nbut it's more Jinja stuff, which 23264 22:13:50,504 --> 22:13:52,221 again, is almost identical to Python. 23265 22:13:52,221 --> 22:13:55,311 If there's a name in\nthe session variable 23266 22:13:55,312 --> 22:14:00,082 then literally say you are logged in\n 23267 22:14:00,081 --> 22:14:04,641 And then notice this, I've got a simple\n 23268 22:14:04,642 --> 22:14:07,312 Else, if there is no\nname in the session 23269 22:14:07,312 --> 22:14:11,122 then it apparently says you are not\n 23270 22:14:11,122 --> 22:14:13,612 link to /login and then endif. 23271 22:14:13,611 --> 22:14:16,221 So again, Jinja does\nnot rely on indentation. 23272 22:14:16,221 --> 22:14:19,221 Recall that HTML and CSS don't\nreally care about indentation 23273 22:14:20,452 --> 22:14:24,022 But in code with Jinja,\nyou need these end tags 23274 22:14:24,021 --> 22:14:26,872 endblock, endfor,\nendif, to make super obvious 23275 22:14:26,872 --> 22:14:29,092 that you're done with that thought.
23276 22:14:29,092 --> 22:14:32,601 So session is just this\nmagic variable that we now 23277 22:14:32,601 --> 22:14:36,891 have access to because we've\nincluded these two lines of code 23278 22:14:36,892 --> 22:14:41,902 and these that handle that whole\n 23279 22:14:41,902 --> 22:14:43,851 with a different, unique identifier. 23280 22:14:43,851 --> 22:14:46,521 If I made my code space\npublic and I let all of you 23281 22:14:46,521 --> 22:14:50,031 visit the exact same URL, all of\n 23282 22:14:50,032 --> 22:14:52,102 You could all type your\nown names individually 23283 22:14:52,101 --> 22:14:56,092 all log in at the same URL\nusing different sessions. 23284 22:14:56,092 --> 22:14:59,332 And in fact, I would then see,\nif I go into my terminal window 23285 22:14:59,331 --> 22:15:03,991 here and my login directory, notice the\n 23286 22:15:03,991 --> 22:15:07,881 And if I CD into that and type ls,\n 23287 22:15:07,881 --> 22:15:10,551 or actually, I think I\nstarted the server twice. 23288 22:15:11,941 --> 22:15:14,691 I would ultimately have one\nfile for every one of you. 23289 22:15:14,691 --> 22:15:16,701 And that's what's\nbeautiful about sessions 23290 22:15:16,702 --> 22:15:21,292 is it creates the illusion\nof per user storage. 23291 22:15:21,292 --> 22:15:25,622 Inside of my session is my name,\n 23292 22:15:26,482 --> 22:15:30,512 And the same is going to apply to\n 23293 22:15:30,512 --> 22:15:32,211 Let's see how login works here. 23294 22:15:32,211 --> 22:15:36,622 My login route supports both GET and\n 23295 22:15:36,622 --> 22:15:41,192 And notice this, this login route\n 23296 22:15:41,191 --> 22:15:45,394 If the user got to this\nroute via POST, my inference 23297 22:15:45,395 --> 22:15:47,062 is that they must have submitted a form. 
23298 22:15:47,562 --> 22:15:50,991 Because that's how I'm going to\n 23299 22:15:50,991 --> 22:15:53,631 And if they did submit\nthe form via POST 23300 22:15:53,631 --> 22:15:56,691 I'm going to store, in\nthe session, at the name 23301 22:15:56,691 --> 22:15:59,091 key, whatever the human's name is. 23302 22:15:59,092 --> 22:16:01,267 And then, I'm going to\nredirect them back to slash. 23303 22:16:01,267 --> 22:16:04,352 Otherwise, I'm going to\nshow them the login form. 23304 22:16:05,631 --> 22:16:08,961 If I go to this login\nform, which lives at 23305 22:16:08,961 --> 22:16:12,861 literally, slash login, by default,\n 23306 22:16:14,721 --> 22:16:16,881 And so that's why I see the form. 23307 22:16:18,411 --> 22:16:22,311 The form, very cleverly,\nsubmits to itself 23308 22:16:22,312 --> 22:16:26,991 like the one route/login submits\nto its same self, /login 23309 22:16:26,991 --> 22:16:29,581 but it uses POST when\nyou submit the form. 23310 22:16:29,581 --> 22:16:33,711 And this is a nice way of having one\n 23311 22:16:36,142 --> 22:16:40,652 When I'm just there visiting /login\n 23312 22:16:40,652 --> 22:16:45,741 But if I submit the form, then this\n 23313 22:16:45,741 --> 22:16:49,732 and this just avoids my having to have\n 23314 22:16:50,301 --> 22:16:55,402 I can just have one route that\nhandles both GET and POST. 23315 22:16:57,182 --> 22:16:58,792 Well, it's as simple as this. 23316 22:16:58,792 --> 22:17:01,221 Change whatever name\nis in the session to be 23317 22:17:01,221 --> 22:17:05,451 none, which is Python's version of null,\n 23318 22:17:06,081 --> 22:17:12,152 Because now, in index.html, I will\n 23319 22:17:13,471 --> 22:17:16,921 And so I'll tell the user\ninstead, you are not logged in. 23320 22:17:18,375 --> 22:17:20,542 I want to say as simple as\nthis is, though I realize 23321 22:17:20,542 --> 22:17:22,402 this is a bunch of steps involved. 
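The three routes walked through above can be modeled without any framework. The sketch below is not the lecture's Flask code: `handle` is a hypothetical function that takes the request method, the path, and a plain dict standing in for the session, and returns a status code with either a redirect target or page text. The redirect after logging out is an assumption.

```python
def handle(method, path, session, form=None):
    """Return (status, detail) for a request, updating `session` in place."""
    form = form or {}

    if path == "/":
        if not session.get("name"):
            return 302, "/login"  # nobody logged in: redirect to the form
        return 200, f"You are logged in as {session['name']}."

    if path == "/login":
        if method == "POST":      # the form was submitted
            session["name"] = form.get("name")
            return 302, "/"
        return 200, "login form"  # plain GET: just show the form

    if path == "/logout":
        session["name"] = None    # forget the user
        return 302, "/"           # assumption: send them back to the start

    return 404, "not found"


session = {}
first_visit = handle("GET", "/", session)
after_login = handle("POST", "/login", session, {"name": "David"})
home_page = handle("GET", "/", session)
```

One `/login` path serves both purposes: a GET shows the form and a POST processes it, which is the same pattern the lecture's route uses.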
23322 22:17:22,402 --> 22:17:25,581 This is the essence of every\nwebsite on the internet that 23323 22:17:25,581 --> 22:17:26,951 has usernames and passwords. 23324 22:17:26,952 --> 22:17:30,202 And we skip the password name step for\n 23325 22:17:30,202 --> 22:17:34,612 but this is how every website out\n 23326 22:17:34,611 --> 22:17:37,221 And how this works,\nultimately, is that as soon 23327 22:17:37,221 --> 22:17:41,001 as you use in Python lines\nlike this and lines like this 23328 22:17:41,001 --> 22:17:45,141 Flask takes care of stamping the\n 23329 22:17:45,142 --> 22:17:50,392 and whenever Flask sees the same\ncookie coming back from a user 23330 22:17:50,392 --> 22:17:53,632 it grabs the appropriate\nfile from that folder 23331 22:17:53,631 --> 22:17:56,042 loads it into the\nsession global variable 23332 22:17:56,042 --> 22:18:01,012 so that your code is now unique\nto that user and their name. 23333 22:18:01,012 --> 22:18:03,922 Let's do one other\nexample with sessions here 23334 22:18:03,922 --> 22:18:06,652 that'll show how we might use\nthese, now, for shopping carts. 23335 22:18:06,652 --> 22:18:09,142 Let me go into the store example here. 23336 22:18:09,142 --> 22:18:10,972 Let me go ahead and\nrun this thing first. 23337 22:18:10,971 --> 22:18:16,851 If I run store in my same\ntab and go back over here 23338 22:18:16,851 --> 22:18:19,881 we'll see a very ugly\ne-commerce site that 23339 22:18:19,881 --> 22:18:21,771 just sells seven different books here. 23340 22:18:21,771 --> 22:18:26,258 But each of these books has a button\n 23341 22:18:26,259 --> 22:18:28,342 All right, well where are\nthese books coming from? 23342 22:18:29,361 --> 22:18:32,751 Let me go into my terminal window again. 23343 22:18:32,751 --> 22:18:35,991 Let me go into this example,\nwhich is called store 23344 22:18:35,991 --> 22:18:40,911 and let me open up about\nindex dot ht-- whoops. 
23345 22:18:40,911 --> 22:18:48,381 Let's open up index,\nhow about, books.html 23346 22:18:48,381 --> 22:18:50,671 is the default one, not index this time. 23347 22:18:50,672 --> 22:18:55,461 So if I look here, notice that\nthat route that we just saw 23348 22:18:55,461 --> 22:19:00,172 uses a for loop in Jinja to iterate\n 23349 22:19:00,172 --> 22:19:03,532 and it outputs, in an H2\ntag, the title of the book 23350 22:19:03,532 --> 22:19:05,722 and then another one of these forms. 23351 22:19:08,072 --> 22:19:11,991 Let's go ahead and open up app.py,\n 23352 22:19:11,991 --> 22:19:13,432 what's kicking all of this off. 23353 22:19:13,432 --> 22:19:16,851 Notice that this file is\nimporting session support. 23354 22:19:16,851 --> 22:19:19,851 It's configuring sessions\ndown here, but it's also 23355 22:19:19,851 --> 22:19:22,292 connecting to a store.db file. 23356 22:19:23,842 --> 22:19:29,362 And notice this, in my / route,\nI'm selecting star from books 23357 22:19:29,361 --> 22:19:31,641 which is going to give me\na list of dictionaries 23358 22:19:31,642 --> 22:19:33,652 each of which represents a row of books. 23359 22:19:33,652 --> 22:19:37,881 And I'm going to pass that list of\n 23360 22:19:37,881 --> 22:19:41,601 is why this for loop\nworks the way it does. 23361 22:19:41,601 --> 22:19:43,281 Let's look at this actual database. 23362 22:19:43,282 --> 22:19:48,472 Let me increase my terminal window\n 23363 22:19:50,881 --> 22:19:55,531 It's a book-- it's a table called\n 23364 22:19:55,532 --> 22:19:58,552 Let's do select star\nfrom books semicolon. 23365 22:19:58,551 --> 22:20:01,311 There are the seven books,\neach of which has a unique ID. 23366 22:20:01,312 --> 22:20:03,241 And you might see where this is going.
23367 22:20:03,241 --> 22:20:06,801 If I go to the UI and I look\nat each of these buttons 23368 22:20:06,801 --> 22:20:11,961 for add to cart, just like Amazon might\n 23369 22:20:13,312 --> 22:20:15,264 And what's magical here,\njust like deregister 23370 22:20:15,264 --> 22:20:17,182 even though I didn't\nhighlight it at the time 23371 22:20:17,182 --> 22:20:19,342 there's another type\nof input that allows 23372 22:20:19,342 --> 22:20:23,241 you to specify a value without the\n 23373 22:20:23,241 --> 22:20:26,031 Instead of type equals\ntext or type equals submit 23374 22:20:26,032 --> 22:20:30,992 type equals hidden will put the value in\n 23375 22:20:30,991 --> 22:20:34,101 So that's how I'm saying that\nthe ID of this book is one 23376 22:20:34,101 --> 22:20:38,312 the ID of this book is two, the ID\n 23377 22:20:38,312 --> 22:20:40,882 And each of these forms,\nthen, will submit, apparently 23378 22:20:40,881 --> 22:20:45,842 to /cart using POST and that would\n 23379 22:20:46,592 --> 22:20:48,622 Let me click on one or two of these. 23380 22:20:48,622 --> 22:20:51,592 Let's add the first book, add to cart. 23381 22:20:52,642 --> 22:20:54,922 Notice my route changed to /cart. 23382 22:20:54,922 --> 22:20:58,882 All right, let's go back and\nlet's add book number two. 23383 22:20:59,872 --> 22:21:02,872 And let's skip ahead to the\nseventh book, Deathly Hallows 23384 22:21:02,872 --> 22:21:05,612 and now we have all three books here. 23385 22:21:05,611 --> 22:21:09,291 So what does the cart route do at /cart? 23386 22:21:10,042 --> 22:21:15,351 If I go back to my terminal window,\n 23387 22:21:15,351 --> 22:21:17,721 there's a lot going on\nhere, but let's see. 23388 22:21:17,721 --> 22:21:21,292 So the /cart route\nsupports both GET and POST 23389 22:21:21,292 --> 22:21:24,892 which is a nice way to\nconsolidate things into one URL. 23390 22:21:24,892 --> 22:21:26,792 All right, this is interesting.
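As a sketch of what each book's markup might look like, the hypothetical Python helper below produces an H2 title followed by a form that posts to /cart, with the book's ID carried in a hidden input. The lecture builds this with a Jinja template instead; the field name `id` and the button text are assumptions.

```python
from html import escape


def render_book(book):
    """Return the HTML for one book: a title plus an Add to Cart form."""
    title = escape(book["title"])
    book_id = escape(str(book["id"]))
    return (
        f"<h2>{title}</h2>\n"
        '<form action="/cart" method="post">\n'
        # The hidden input submits the ID without showing it to the user.
        f'    <input name="id" type="hidden" value="{book_id}">\n'
        '    <button type="submit">Add to Cart</button>\n'
        "</form>"
    )


html_snippet = render_book({"id": 1, "title": "Book One"})
```

Since every form carries a different hidden value, the server can tell which book's button was clicked even though the buttons look identical.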
23391 22:21:26,792 --> 22:21:30,945 If there is not a, quote\nunquote, "cart" key in session 23392 22:21:30,945 --> 22:21:32,612 we haven't technically seen this syntax. 23393 22:21:32,611 --> 22:21:36,771 But long story short, these lines\n 23394 22:21:37,982 --> 22:21:43,161 It makes sure that there's a cart\n 23395 22:21:43,161 --> 22:21:45,741 and it's by default going\nto be an empty list. 23396 22:21:46,282 --> 22:21:48,282 That just means you have\nan empty shopping cart. 23397 22:21:48,282 --> 22:21:55,612 But if the user visits this route via\n 23398 22:21:55,611 --> 22:21:59,241 they didn't muck with the form in any\n 23399 22:21:59,241 --> 22:22:02,691 they gave me a valid ID, then\nI'm going to use this syntax. 23400 22:22:02,691 --> 22:22:05,542 If session bracket cart is a list-- 23401 22:22:05,542 --> 22:22:07,846 recall from a couple of weeks\nago that dot append just 23402 22:22:07,846 --> 22:22:08,971 adds something to the list. 23403 22:22:08,971 --> 22:22:13,201 So I'm going to add the ID to the\n 23404 22:22:13,202 --> 22:22:18,612 Otherwise, if the user is at /cart\n 23405 22:22:18,611 --> 22:22:22,091 Select star from books where ID is in. 23406 22:22:22,092 --> 22:22:24,932 And this might be syntax\nyou recall from Pset six. 23407 22:22:24,932 --> 22:22:27,542 It lets you look for\nmultiple IDs all at once 23408 22:22:27,542 --> 22:22:29,762 because if I have a list of session-- 23409 22:22:29,762 --> 22:22:34,232 list of IDs in my cart, I can\nget all of those books at once. 23410 22:22:34,232 --> 22:22:36,932 So long story short,\nwhat has happened here? 23411 22:22:36,932 --> 22:22:42,932 I am storing, in the cart, the books\n 23412 22:22:42,932 --> 22:22:45,792 My browser is sending the same\nhand stamp again and again 23413 22:22:45,792 --> 22:22:48,911 which is how this website knows that\n 23414 22:22:48,911 --> 22:22:50,701 and not you or not Carter or not Emma. 
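A minimal model of the cart logic described above, using a plain dict for the session and Python's built-in `sqlite3` instead of the course's SQL library. Because `sqlite3` does not expand a list into a single `?`, the sketch builds one placeholder per ID. The function name `cart_route`, the return shape, and the sample book titles are assumptions for illustration.

```python
import sqlite3


def cart_route(method, session, db, form=None):
    """POST adds a book ID to the cart; GET returns the books in the cart."""
    # Make sure every session has a cart, an empty list by default.
    if "cart" not in session:
        session["cart"] = []

    if method == "POST":
        book_id = (form or {}).get("id")
        if book_id:
            # int() raises ValueError if the submitted ID is not numeric.
            session["cart"].append(int(book_id))
        return 302, "/cart"

    # GET: fetch every book whose ID is in the cart with a single query.
    if not session["cart"]:
        return 200, []
    placeholders = ", ".join("?" for _ in session["cart"])
    rows = db.execute(
        f"SELECT id, title FROM books WHERE id IN ({placeholders}) ORDER BY id",
        session["cart"],
    )
    return 200, rows.fetchall()


# Sample data: seven placeholder books in an in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
db.executemany(
    "INSERT INTO books (id, title) VALUES (?, ?)",
    [(i, f"Book {i}") for i in range(1, 8)],
)

session = {}
for chosen in ("1", "2", "7"):
    cart_route("POST", session, db, {"id": chosen})
status, books = cart_route("GET", session, db)
```

The session stores only the IDs; the titles are looked up from the database each time the cart is displayed.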
23415 22:22:50,702 --> 22:22:54,002 Indeed, if all of us visited the\n 23416 22:22:54,001 --> 22:22:56,791 and allowed that, then we would\nall have our own illusions 23417 22:22:58,411 --> 22:23:00,902 And each of those carts,\nin practice, would just 23418 22:23:00,902 --> 22:23:04,682 be stored in this Flask\nsession directory on the server 23419 22:23:04,682 --> 22:23:06,482 so that the server\ncan keep track of each 23420 22:23:06,482 --> 22:23:09,452 of us using, again, these\ncookie values that are being 23421 22:23:09,452 --> 22:23:13,182 sent back and forth via these headers. 23422 22:23:13,682 --> 22:23:15,991 I know that's a lot,\nbut again, it's just 23423 22:23:15,991 --> 22:23:19,471 the new Python way of\njust leveraging those HTTP 23424 22:23:19,471 --> 22:23:22,121 headers from last week in a clever way. 23425 22:23:22,122 --> 22:23:26,101 Any questions before we look\nat one final set of examples? 23426 22:23:27,410 --> 22:23:31,322 AUDIENCE: [INAUDIBLE] understand how\n 23427 22:23:31,322 --> 22:23:34,744 How does it use [INAUDIBLE],,\nhow do you change [INAUDIBLE]?? 23428 22:23:34,744 --> 22:23:39,145 Because in order to use a GET\nrequest dot [INAUDIBLE] equals 23429 22:23:39,145 --> 22:23:43,081 there has to be an\nexchange in [INAUDIBLE].. 23430 22:23:43,081 --> 22:23:46,292 DAVID: So I think you're\nasking about using the GET 23431 22:23:46,292 --> 22:23:48,461 and POST in the same function. 23432 22:23:48,461 --> 22:23:52,982 So this is just a nice\naesthetic, if you will. 23433 22:23:52,982 --> 22:23:56,432 If I had to have separate\nroutes for GET and POST, I mean 23434 22:23:56,432 --> 22:23:59,191 it literally might mean I need\ntwice as many routes in my file. 23435 22:23:59,191 --> 22:24:01,441 And it just starts to\nget a little annoying. 
23436 22:24:01,441 --> 22:24:04,182 And these days, too, in\nterms of user experience 23437 22:24:04,182 --> 22:24:08,161 this is maybe only appeals to the\n 23438 22:24:09,452 --> 22:24:11,461 You don't want to have\nlots of words in the URL 23439 22:24:11,461 --> 22:24:14,922 it's nice if the URLs are nice and\n 23440 22:24:14,922 --> 22:24:21,032 So it's nice if I can centralize all of\n 23441 22:24:21,032 --> 22:24:25,652 only, and not in multiple routes,\none for GET, one for POST. 23442 22:24:25,652 --> 22:24:30,012 It's a little nitpicky of me,\nbut this is commonly done here. 23443 22:24:30,012 --> 22:24:34,622 So what this code here means is\nthat this route, this function 23444 22:24:34,622 --> 22:24:38,222 henceforth will support both\nGET requests and POST requests. 23445 22:24:38,221 --> 22:24:42,211 But then I need to distinguish between\n 23446 22:24:42,211 --> 22:24:44,881 Because if it's a GET request,\nI want to show the cart. 23447 22:24:44,881 --> 22:24:47,342 If it's a POST request, I\nwant to update the cart. 23448 22:24:47,342 --> 22:24:51,542 And the simplest way to do that\n 23449 22:24:51,542 --> 22:24:54,991 In the request variable that we\nimported from Flask up above 23450 22:24:54,991 --> 22:24:57,842 you can check what is the\ncurrent type of request. 23451 22:24:57,842 --> 22:25:00,932 Is it a GET, is it a POST, or\nis it something else altogether? 23452 22:25:02,581 --> 22:25:07,652 If it's a POST, that must mean, because\n 23453 22:25:07,652 --> 22:25:10,922 that the user clicked\nthe Add to Cart button. 23454 22:25:10,922 --> 22:25:16,112 Otherwise, if it's not POST, it's\n 23455 22:25:16,111 --> 22:25:20,371 Then, I just want to show the\nuser the contents of the cart 23456 22:25:20,372 --> 22:25:21,881 and I use these lines instead. 23457 22:25:21,881 --> 22:25:25,682 So it's just one way of avoiding having\n 23458 22:25:26,191 --> 22:25:29,101 You can combine them so long\nas you have a check like this. 
23459 22:25:29,101 --> 22:25:35,432 If I really wanted to be pedantic, I\n 23460 22:25:38,012 --> 22:25:40,592 This would be more symmetric,\nbut it's not really necessary 23461 22:25:40,592 --> 22:25:43,512 because I know there's\nonly two possibilities. 23462 22:25:46,092 --> 22:25:48,902 All right, let's do one\nfinal set of examples here 23463 22:25:48,902 --> 22:25:51,092 that's going to tie the\nlast of these features 23464 22:25:51,092 --> 22:25:53,521 together to something\nthat you probably see 23465 22:25:53,521 --> 22:25:56,373 quite often in real-world applications. 23466 22:25:56,373 --> 22:25:58,081 And that, for better\nor for worse, is now 23467 22:25:58,081 --> 22:26:01,421 going to involve tying back in\nsome JavaScript from last week. 23468 22:26:01,422 --> 22:26:03,422 The goal at hand of\nthese examples is not 23469 22:26:03,422 --> 22:26:06,805 to necessarily master how you yourself\n 23470 22:26:06,804 --> 22:26:09,721 code, the JavaScript code, but just\n 23471 22:26:09,721 --> 22:26:11,051 these different languages work. 23472 22:26:11,051 --> 22:26:13,441 So that for final\nprojects, especially if you 23473 22:26:13,441 --> 22:26:16,921 do want to add JavaScript functionality,\n 23474 22:26:16,922 --> 22:26:20,101 you at least have the bare\nbones of a mental model for how 23475 22:26:20,101 --> 22:26:22,542 you can tie these languages together. 23476 22:26:22,542 --> 22:26:26,131 Even though our focus, generally,\n 23477 22:26:26,131 --> 22:26:28,322 than on JavaScript from last week. 23478 22:26:28,322 --> 22:26:33,422 Let me go ahead and open up an example\n 23479 22:26:35,021 --> 22:26:39,092 And let me go into my URL here and\n 23480 22:26:39,092 --> 22:26:44,432 like by default. This has just a simple\n 23481 22:26:44,432 --> 22:26:47,461 Let's take a look at the HTML\nthat just got sent to my browser. 23482 22:26:47,461 --> 22:26:49,572 All right, there's not\nmuch going on here at all. 
23483 22:26:49,572 --> 22:26:52,682 So there's a form whose\naction is /search. 23484 22:26:52,682 --> 22:26:54,482 It's going to submit via GET. 23485 22:26:54,482 --> 22:26:58,902 It's going to use a q parameter, just\n 23486 22:26:58,902 --> 22:27:01,872 So this actually looks like the\nGoogle form we did last week. 23487 22:27:01,872 --> 22:27:03,752 So let's see what goes on here. 23488 22:27:03,751 --> 22:27:06,151 Let me search for something like cat. 23489 22:27:09,581 --> 22:27:12,042 all right, so this is actually\na somewhat familiar file. 23490 22:27:12,042 --> 22:27:16,232 What I've gone ahead and done is I've\n 23491 22:27:16,232 --> 22:27:19,052 from a couple of weeks ago\nwhen we first introduced SQL 23492 22:27:19,051 --> 22:27:21,601 and I loaded them into this\ndemo so that you can search 23493 22:27:21,601 --> 22:27:23,411 by keyword for any word you want. 23494 22:27:24,512 --> 22:27:27,661 If we were to do this again, we\n 23495 22:27:27,661 --> 22:27:32,861 that contain D-O-G, dog, as a\nsubstring somewhere and so forth. 23496 22:27:32,861 --> 22:27:34,891 So this is a traditional\nway of doing this. 23497 22:27:34,892 --> 22:27:40,502 Just like in Google, it uses\n/search?q=cat, q=dog, and so forth. 23498 22:27:41,381 --> 22:27:45,432 Well, let's just take a\nquick look at app.py here. 23499 22:27:45,432 --> 22:27:50,702 Let me go into my zero example\nhere, show zero, and open up 23500 22:27:50,702 --> 22:27:53,222 app.py and see what's going on. 23501 22:27:54,881 --> 22:27:57,542 Here's the form, that's\nhow we started today. 23502 22:27:57,542 --> 22:28:00,017 And here is the /search route. 23503 22:28:00,017 --> 22:28:01,142 Well, what's going on here? 23504 22:28:01,142 --> 22:28:02,832 This gets a little interesting. 23505 22:28:02,831 --> 22:28:06,031 So I first select a whole\nbunch of shows by doing this. 23506 22:28:06,032 --> 22:28:10,652 Select star from shows, where\ntitle like question mark. 
23507 22:28:10,652 --> 22:28:15,331 And then I'm using some\npercent signs from SQL 23508 22:28:15,331 --> 22:28:17,792 on both the left and the\nright, and I'm plugging 23509 22:28:17,792 --> 22:28:20,211 in whatever the user's input was for q. 23510 22:28:20,211 --> 22:28:22,801 If I didn't use like and\nI used equal instead 23511 22:28:22,801 --> 22:28:25,891 I could get rid of these curly\nbrace, these percent signs 23512 22:28:25,892 --> 22:28:29,522 but then it would have to be a show\n 23513 22:28:29,521 --> 22:28:32,221 to it being like cat or like dog. 23514 22:28:32,221 --> 22:28:35,792 This whole line returns to me a\n 23515 22:28:35,792 --> 22:28:38,682 represents a show in the database. 23516 22:28:38,682 --> 22:28:41,971 And then, I'm passing all of those\n 23517 22:28:41,971 --> 22:28:45,771 So let's just follow that\nbreadcrumb, let's open up shows dot-- 23518 22:28:48,152 --> 22:28:50,331 All right, so this is\nwhere templating gets cool. 23519 22:28:50,331 --> 22:28:53,031 So I just passed back hundreds\nof results, potentially 23520 22:28:53,032 --> 22:28:56,512 but the only thing I'm\noutputting is an unordered list 23521 22:28:56,512 --> 22:29:00,592 and using a Jinja for\nloop and li tag containing 23522 22:29:00,592 --> 22:29:02,506 the titles of each of those shows. 23523 22:29:02,506 --> 22:29:04,881 And just to prove that this\nis indeed a familiar data set 23524 22:29:04,881 --> 22:29:09,801 and I actually simplified it a bit,\n 23525 22:29:09,801 --> 22:29:14,091 I threw away all the other stuff like\n 23526 22:29:14,092 --> 22:29:18,832 and I just have, for instance,\nselect star from shows 23527 22:29:18,831 --> 22:29:21,201 limit 10, just so we can see 10 of them. 23528 22:29:21,202 --> 22:29:23,572 There's 10 of the shows\nfrom that database. 23529 22:29:23,572 --> 22:29:25,982 So that's all that's\nin the database itself. 23530 22:29:25,982 --> 22:29:29,572 So it would look like this is a\npretty vanilla web application. 
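The substring search described above, LIKE with a percent sign on each side of the user's input, can be tried in isolation. This is a minimal sketch using Python's built-in sqlite3 module rather than the library used in the lecture, and the table contents are made up for illustration.

```python
import sqlite3

# In-memory database with a few made-up titles (not the lecture's data set).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT)")
db.executemany(
    "INSERT INTO shows (title) VALUES (?)",
    [("Cat Tales",), ("Top Cat",), ("Dog Days",), ("The Office",)],
)


def search(q):
    """Return titles containing q anywhere, via LIKE with % on both sides."""
    rows = db.execute(
        "SELECT title FROM shows WHERE title LIKE ? ORDER BY id",
        ("%" + q + "%",),
    )
    return [title for (title,) in rows]


def search_exact(q):
    """Return titles equal to q; no wildcards, so substrings do not match."""
    rows = db.execute("SELECT title FROM shows WHERE title = ?", (q,))
    return [title for (title,) in rows]


print(search("cat"))        # matches anywhere; SQLite's LIKE ignores ASCII case
print(search_exact("cat"))  # nothing is titled exactly "cat"
```

Passing the pattern as a placeholder value, instead of pasting it into the SQL string, keeps the user's input from being interpreted as SQL.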
23531 22:29:29,572 --> 22:29:31,607 It uses GET, it submits\nit to the server 23532 22:29:31,607 --> 22:29:33,982 the server spits out a response,\nand that response, then 23533 22:29:33,982 --> 22:29:38,752 looks like this, which is a huge\n 23534 22:29:40,461 --> 22:29:44,131 But everything else\ncomes from a layout.html. 23535 22:29:44,131 --> 22:29:46,771 All the stuff at the\ntop and at the bottom. 23536 22:29:46,771 --> 22:29:50,781 All right, so these days, though, we're\n 23537 22:29:50,782 --> 22:29:53,482 And you start typing something\nand you don't have to hit Submit 23538 22:29:53,482 --> 22:29:56,357 you don't have to click a button,\n 23539 22:29:56,357 --> 22:29:58,711 Web applications, nowadays,\nare much more dynamic. 23540 22:29:58,711 --> 22:30:01,891 So let's take a look at this\nversion one of this thing. 23541 22:30:01,892 --> 22:30:07,792 Let me go into shows one\nand close my previous tabs 23542 22:30:09,902 --> 22:30:14,601 And it's almost the same thing, but\n 23543 22:30:14,601 --> 22:30:16,625 I'm reloading the form,\nthere's no button now. 23544 22:30:16,625 --> 22:30:18,292 So gone is the need for a submit button. 23545 22:30:18,292 --> 22:30:20,312 I want to implement autocomplete now. 23546 22:30:20,312 --> 22:30:22,912 So let's go ahead and\ntype in C. OK, there's 23547 22:30:22,911 --> 22:30:25,881 every show that starts\nwith C. A, there's 23548 22:30:25,881 --> 22:30:27,861 every show that has C-A in it, rather. 23549 22:30:27,861 --> 22:30:31,101 T, there's every show with C-A-T in it. 23550 22:30:31,101 --> 22:30:34,911 I can start it again and do dog,\n 23551 22:30:34,911 --> 22:30:38,811 And notice my URL never changed,\nthere's no /search route 23552 22:30:40,012 --> 22:30:43,652 With every keystroke, it is\nsearching again and again and again. 23553 22:30:43,652 --> 22:30:46,282 That's a nice UX, user experience,\nbecause it's immediate. 23554 22:30:46,282 --> 22:30:48,532 This is what users are\nused to these days. 
23555 22:30:48,532 --> 22:30:53,392 But if I look at the source code\n 23556 22:30:53,392 --> 22:30:58,615 there's just an empty UL by default but\n 23557 22:30:58,615 --> 22:31:00,032 So let's see what's going on here. 23558 22:31:00,032 --> 22:31:03,512 This JavaScript code\nis doing the following. 23559 22:31:03,512 --> 22:31:06,472 Let me zoom in a little bit more. 23560 22:31:06,471 --> 22:31:10,853 This JavaScript code is first\nselecting, with query selector 23561 22:31:10,854 --> 22:31:13,562 which you used this past week,\n 23562 22:31:13,562 --> 22:31:15,502 so that's just getting the text box. 23563 22:31:15,501 --> 22:31:19,136 Then it's adding an event listener\n 23564 22:31:19,137 --> 22:31:21,262 We didn't talk about this\nlast week, but literally 23565 22:31:21,262 --> 22:31:23,362 when you provide any\nkind of input by typing 23566 22:31:23,361 --> 22:31:27,861 by pasting, by any other\nuser interface mechanism 23567 22:31:27,861 --> 22:31:29,671 it triggers an event called input. 23568 22:31:29,672 --> 22:31:31,732 So similar to key press or key up. 23569 22:31:31,732 --> 22:31:35,542 I then have a function, no worries\n 23570 22:31:35,542 --> 22:31:37,244 Then what do I do inside of this? 23571 22:31:37,244 --> 22:31:39,202 All right, so this is\nnew, and this is the part 23572 22:31:39,202 --> 22:31:41,872 that let's just focus on the\nideas and not the syntax. 23573 22:31:41,872 --> 22:31:43,762 JavaScript, nowadays,\ncomes with a function 23574 22:31:43,762 --> 22:31:48,051 called fetch that allows you to\n 23575 22:31:48,051 --> 22:31:49,861 without reloading the whole page. 23576 22:31:49,861 --> 22:31:52,371 You can sort of secretly\ndo it inside of the page. 23577 22:31:53,691 --> 22:31:59,031 slash search question mark q equals\n 23578 22:31:59,032 --> 22:32:03,082 When I get back a response, I want\n 23579 22:32:03,081 --> 22:32:05,331 and store it in a variable called shows. 
23580 22:32:05,331 --> 22:32:08,001 And I'm deliberately bouncing\naround, ignoring special words 23581 22:32:08,001 --> 22:32:11,361 like await and await here, but for\n 23582 22:32:11,361 --> 22:32:14,601 A response came back from the\n 23583 22:32:14,601 --> 22:32:16,432 storing it in a variable called shows. 23584 22:32:17,691 --> 22:32:22,461 I'm using query selector to select\n 23585 22:32:22,461 --> 22:32:26,721 and I'm changing its inner HTML\nto be equal to the shows that 23586 22:32:29,572 --> 22:32:32,402 Here's where, again, developer\ntools are quite powerful. 23587 22:32:32,402 --> 22:32:36,652 Let me go ahead and reload this\npage to get rid of everything. 23588 22:32:36,652 --> 22:32:39,952 And let me now open up inspect. 23589 22:32:39,952 --> 22:32:43,292 Let me go to the Network tab and\n 23590 22:32:43,292 --> 22:32:44,542 between my browser and server. 23591 22:32:44,542 --> 22:32:49,221 I'm going to search for C. Notice that\n 23592 22:32:54,202 --> 22:32:57,812 So I didn't even finish my cat\n 23593 22:32:57,812 --> 22:33:02,662 A bunch of response headers, but let's\n 23594 22:33:02,661 --> 22:33:07,311 This is literally the response from the\n 23595 22:33:07,312 --> 22:33:10,372 No UL, no HTML, no\ntitle, no body, nothing. 23596 22:33:11,452 --> 22:33:12,961 And we can actually simulate this. 23597 22:33:12,961 --> 22:33:17,122 Let me manually go to\nthat same URL, q=c, Enter. 23598 22:33:17,122 --> 22:33:19,131 We are just going to get back-- 23599 22:33:20,572 --> 22:33:25,432 slash search q equals c, we are\n 23600 22:33:25,432 --> 22:33:27,892 which if I view source, it's\nnot even a complete web page. 23601 22:33:27,892 --> 22:33:31,312 The browser is trying to show it to me\n 23602 22:33:31,312 --> 22:33:33,862 but it's really just partial HTML. 
23603 22:33:33,861 --> 22:33:36,471 But that's perfect,\nbecause this is literally 23604 22:33:36,471 --> 22:33:39,381 what I essentially want my\nPython code to copy paste 23605 22:33:39,381 --> 22:33:42,471 into the otherwise empty UL tag. 23606 22:33:42,471 --> 22:33:46,822 And that's what this JavaScript\ncode then, here, is doing. 23607 22:33:46,822 --> 22:33:51,111 Once it gets back that response from the\n 23608 22:33:51,111 --> 22:33:56,131 to plug all of those li's\ninto the UL after the fact. 23609 22:33:56,131 --> 22:33:58,491 Again, changing the so-called DOM. 23610 22:33:58,491 --> 22:34:01,461 But there's a slightly better way\n 23611 22:34:02,932 --> 22:34:07,252 Because if you've got a hundred shows or\n 23612 22:34:08,211 --> 22:34:10,851 Why do I need to send all\nof these stupid HTML tags? 23613 22:34:10,851 --> 22:34:13,952 Why don't I just create those\nwhen I'm ready to create them? 23614 22:34:13,952 --> 22:34:15,682 Well, here's the final flourish. 23615 22:34:15,682 --> 22:34:18,232 Whenever making a web\napplication nowadays 23616 22:34:18,232 --> 22:34:22,402 where client and server keep talking\n 23617 22:34:22,402 --> 22:34:25,191 Gmail does this, literally\nevery cool application 23618 22:34:25,191 --> 22:34:27,531 nowadays you load the\npage once and then it 23619 22:34:27,532 --> 22:34:29,302 keeps on interacting\nwith you without you 23620 22:34:29,301 --> 22:34:31,971 reloading or having to change the URL. 23621 22:34:31,971 --> 22:34:36,471 Let's actually use a format called\n 23622 22:34:36,471 --> 22:34:39,561 is to say there's just a\nbetter, more efficient, better 23623 22:34:39,562 --> 22:34:42,152 designed way to send that same data. 23624 22:34:42,152 --> 22:34:46,432 I'm going to go into shows\ntwo now and do flask run. 23625 22:34:46,432 --> 22:34:48,172 And I'm going to go\nback to my page here.
23626 22:34:48,172 --> 22:34:52,522 The user interface is exactly the same,\n 23627 22:34:52,521 --> 22:34:56,551 Here's C, C-A, C-A-T, and so forth. 23628 22:34:56,551 --> 22:34:59,031 But let's see what's coming back now. 23629 22:34:59,032 --> 22:35:09,152 If I go to /search?q=cat, Enter, notice\n 23630 22:35:09,152 --> 22:35:12,622 But the fact that it's so\ncompact is actually a good thing. 23631 22:35:12,622 --> 22:35:16,372 This is actually going to-- let\n 23632 22:35:17,452 --> 22:35:20,752 This is what's called\nJavaScript Object Notation. 23633 22:35:20,751 --> 22:35:26,541 In JavaScript, an angle-- a square\n 23634 22:35:26,542 --> 22:35:32,542 In JavaScript, a curly bracket says\n 23635 22:35:36,782 --> 22:35:42,772 Yes, sort of recall that you can now\n 23636 22:35:42,771 --> 22:35:45,122 notation using colons like this. 23637 22:35:45,122 --> 22:35:48,711 So long story short, cryptic\nas this is to you and me 23638 22:35:48,711 --> 22:35:52,072 and not very human friendly,\nit's very machine friendly. 23639 22:35:52,072 --> 22:35:55,792 Because for every\ntitle in that database 23640 22:35:55,792 --> 22:36:01,562 I get back its ID and its title, its\n 23641 22:36:01,562 --> 22:36:05,632 And this is a very generic format that\n 23642 22:36:05,631 --> 22:36:07,101 interface, might return to you. 23643 22:36:07,101 --> 22:36:09,141 And this is how APIs, nowadays, work. 23644 22:36:09,142 --> 22:36:13,882 You get back very raw textual\ndata in this format, JSON format 23645 22:36:13,881 --> 22:36:17,601 and then you can write code that\nactually programmatically turns 23646 22:36:17,601 --> 22:36:22,292 that JSON data into any language\nyou want, for instance, HTML. 23647 22:36:22,292 --> 22:36:25,252 So here's the third and final\nversion of this program. 23648 22:36:29,631 --> 22:36:32,452 I then, when I get input,\ncall this function. 
23649 22:36:32,452 --> 22:36:38,991 I fetch slash search q equals whatever\n 23650 22:36:38,991 --> 22:36:41,955 I then wait for the response,\nbut instead of getting text 23651 22:36:41,955 --> 22:36:44,872 I'm calling this other function that\n 23652 22:36:44,872 --> 22:36:47,092 called JSON, that just parses that. 23653 22:36:47,092 --> 22:36:51,232 It turns it into a dictionary for me,\n 23654 22:36:51,232 --> 22:36:53,572 and stores it in a\nvariable called shows. 23655 22:36:53,572 --> 22:36:57,831 And this is where you start to see the\n 23656 22:36:57,831 --> 22:37:01,161 Let me initialize a variable called\n 23657 22:37:01,161 --> 22:37:03,961 using single quotes, but I\ncould also use double quotes. 23658 22:37:03,961 --> 22:37:06,232 This is JavaScript syntax for a loop. 23659 22:37:06,232 --> 22:37:10,101 Let me iterate over every\nID in the shows list 23660 22:37:10,101 --> 22:37:13,881 that I just got back from the server,\nthat big chunk of JSON data. 23661 22:37:13,881 --> 22:37:19,191 Let me create a variable called\n 23662 22:37:19,191 --> 22:37:21,652 the title of the show at that ID. 23663 22:37:21,652 --> 22:37:23,661 But for reasons we'll\ncome back to, let me 23664 22:37:23,661 --> 22:37:25,581 replace a couple of scary characters. 23665 22:37:25,581 --> 22:37:31,671 Then let me dynamically add to this\n 23666 22:37:33,232 --> 22:37:35,991 And then very lastly,\nafter this for loop 23667 22:37:35,991 --> 22:37:42,081 let me update the UL's inner HTML to\n 23668 22:37:42,081 --> 22:37:44,661 So in short, don't worry\ntoo much about the syntax 23669 22:37:44,661 --> 22:37:47,119 because you won't need to use\nthis unless you start playing 23670 22:37:47,119 --> 22:37:49,231 with more advanced features quite soon.
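The walkthrough above is in JavaScript, but the same idea (the server sends raw JSON, and the receiving code turns it into list items) can be sketched end to end in Python. This is an illustration under stated assumptions, not the lecture's code: the rows are made up, and which characters get replaced (here & and <) is a guess at the "scary characters" mentioned.

```python
import json

# Made-up rows standing in for what a database query might return.
rows = [
    {"id": 1, "title": "Cat & Dog"},
    {"id": 2, "title": "<Top> Cat"},
]

# Server side: serialize the rows as compact JSON text.
payload = json.dumps(rows)

# Client side: parse the JSON back into a list of dictionaries,
# then grow one string of HTML, one <li> per show.
shows = json.loads(payload)
html = ""
for show in shows:
    # Escape & first, then <, so a title cannot inject markup of its own.
    title = show["title"].replace("&", "&amp;").replace("<", "&lt;")
    html += "<li>" + title + "</li>"

print(payload)
print(html)
```

The escaping step matters because the titles are data, not markup: without it, a title containing a tag would be interpreted by the browser when the string is assigned to the list's inner HTML.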
23671 22:37:49,232 --> 22:37:51,202 But what we're doing is,\nwith JavaScript, we're 23672 22:37:51,202 --> 22:37:54,122 creating a bigger and bigger\nand bigger string of HTML 23673 22:37:54,122 --> 22:37:57,562 containing all of the open brackets,\n 23674 22:37:57,562 --> 22:38:00,922 but we're just grabbing the\nraw data from the server. 23675 22:38:00,922 --> 22:38:02,932 And so in fact in\nproblem set nine, you're 23676 22:38:02,932 --> 22:38:06,652 going to use a real world third\n 23677 22:38:06,652 --> 22:38:08,092 interface, for which you sign up. 23678 22:38:08,092 --> 22:38:10,851 The data you're going to get\nback from that API is not 23679 22:38:10,851 --> 22:38:14,301 going to be show titles, but\nactually stock quotes and stocks 23680 22:38:14,301 --> 22:38:17,042 ticker symbols and the prices of last-- 23681 22:38:17,042 --> 22:38:19,072 at which stocks were\nlast bought or sold 23682 22:38:19,072 --> 22:38:21,880 and you're going to get that\ndata back in JSON format. 23683 22:38:21,880 --> 22:38:24,922 And you're going to write a bit of\n 23684 22:38:24,922 --> 22:38:27,992 to the requisite HTML on the page. 23685 22:38:27,991 --> 22:38:31,281 So the final result here is\nliterally the kind of autocomplete 23686 22:38:31,282 --> 22:38:33,652 that you and I see and\ntake for granted every day 23687 22:38:33,652 --> 22:38:35,812 and that's ultimately how it works. 23688 22:38:35,812 --> 22:38:39,442 HTML and CSS are used to present\nthe data, your so-called view. 23689 22:38:39,441 --> 22:38:43,861 Python might be used to send or\n 23690 22:38:43,861 --> 22:38:46,131 And then lastly, JavaScript\nis going to be used 23691 22:38:46,131 --> 22:38:48,519 to make things dynamic and interactive. 23692 22:38:48,519 --> 22:38:50,601 So I know that's a whole\nbunch of building blocks 23693 22:38:50,601 --> 22:38:53,394 but the whole point of problem set\n 23694 22:38:53,394 --> 22:38:55,911 set the stage for hopefully a\nvery successful final project. 
23695 22:38:55,911 --> 22:38:58,161 Why don't we go ahead and\nwrap up there, and we'll see 23696 22:38:58,161 --> 22:39:01,372 you one last time next week for emoji.
