All language subtitles for sql-720p-en

af Afrikaans
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bn Bengali
bs Bosnian
bg Bulgarian
ca Catalan
ceb Cebuano
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
tl Filipino
fi Finnish
fr French
fy Frisian
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
km Khmer
ko Korean
ku Kurdish (Kurmanji)
ky Kyrgyz
lo Lao
la Latin
lv Latvian
lt Lithuanian
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mn Mongolian
my Myanmar (Burmese)
ne Nepali
no Norwegian
ps Pashto
fa Persian
pl Polish
pt Portuguese
pa Punjabi
ro Romanian
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
st Sesotho
sn Shona
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
te Telugu
th Thai
tr Turkish
uk Ukrainian Download
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
or Odia (Oriya)
rw Kinyarwanda
tk Turkmen
tt Tatar
ug Uyghur
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 0 00:00:05,000 --> 00:00:05,250 1 00:00:05,250 --> 00:00:07,250 DOUG LLOYD: So in our videos on web development, 2 00:00:07,250 --> 00:00:10,420 we've talked a couple of times now about what a database is. 3 00:00:10,420 --> 00:00:12,920 And in this video, we're going to get into a bit more detail 4 00:00:12,920 --> 00:00:16,820 and show you exactly what a database is, why we would use it, 5 00:00:16,820 --> 00:00:18,920 and how we might manipulate it. 6 00:00:18,920 --> 00:00:23,844 In order for us to build websites that are sort of more complex than just 7 00:00:23,844 --> 00:00:26,010 a page where they just go and see news, for example, 8 00:00:26,010 --> 00:00:27,926 we might need a database to store information, 9 00:00:27,926 --> 00:00:30,620 such as username and password combinations 10 00:00:30,620 --> 00:00:32,830 so that a user attempts to log in. 11 00:00:32,830 --> 00:00:34,830 That log-in information is sent to the database. 12 00:00:34,830 --> 00:00:36,740 It is checked against information in the database 13 00:00:36,740 --> 00:00:39,230 to see whether that username-password combination matches. 14 00:00:39,230 --> 00:00:41,400 And if so, it lets the user in. 15 00:00:41,400 --> 00:00:44,990 We might also store other stuff for users, like their shopping history 16 00:00:44,990 --> 00:00:47,510 or really any other information that you might 17 00:00:47,510 --> 00:00:50,750 want to keep long-term for a user. 18 00:00:50,750 --> 00:00:54,740 Now, if you ever used programs like Microsoft Excel or Google Sheets 19 00:00:54,740 --> 00:01:00,830 or Numbers, you're probably familiar with the basic idea of a database. 20 00:01:00,830 --> 00:01:03,584 A database consists of a couple of different levels of hierarchy. 21 00:01:03,584 --> 00:01:05,000 Within a database, we have tables. 22 00:01:05,000 --> 00:01:08,130 And within each of those tables, we have rows and columns. 23 00:01:08,130 --> 00:01:10,880 And if you are familiar with Excel, or perhaps even if you're not, 24 00:01:10,880 --> 00:01:14,420 let's take a quick second and I can draw this analogy for you 25 00:01:14,420 --> 00:01:17,700 about what a database, a table, a row, a column is. 26 00:01:17,700 --> 00:01:22,700 And then we can translate that in just a moment to the same idea in SQL. 27 00:01:22,700 --> 00:01:26,769 So on my screen here, I've opened up an Excel file. 28 00:01:26,769 --> 00:01:29,060 And across the top here, you see I have these letters-- 29 00:01:29,060 --> 00:01:34,400 A, B, C, D, E. These would be different columns in my file. 30 00:01:34,400 --> 00:01:37,800 Down the left side, I have numbered rows, where I might put information. 31 00:01:37,800 --> 00:01:41,130 So, for example, I might just put my name here in cell A1. 32 00:01:41,130 --> 00:01:43,672 So it's in the A column in the first row. 33 00:01:43,672 --> 00:01:45,380 Now, down at the bottom left is a feature 34 00:01:45,380 --> 00:01:48,830 that you may not have used too much if you don't use Excel very frequently. 35 00:01:48,830 --> 00:01:50,660 But there's these notions of sheets. 36 00:01:50,660 --> 00:01:53,060 Sheets are sort of akin to different tables. 37 00:01:53,060 --> 00:01:56,780 Notice that when I switch to sheet 2, the data that was in sheet 1, my name, 38 00:01:56,780 --> 00:01:57,870 is no longer there. 39 00:01:57,870 --> 00:02:02,000 So each sheet has its own unique set of rows and columns. 40 00:02:02,000 --> 00:02:06,480 But all of these sheets are still bound up in one single file called Book 1. 41 00:02:06,480 --> 00:02:09,380 So the analogy here is that Book 1 is our database, which 42 00:02:09,380 --> 00:02:12,770 contains a different number of tables, which in Excel parlance 43 00:02:12,770 --> 00:02:14,090 is just a sheet. 44 00:02:14,090 --> 00:02:16,850 And each sheet has columns that we can put data in 45 00:02:16,850 --> 00:02:20,010 and rows that we can put data in, as well. 46 00:02:20,010 --> 00:02:23,120 So there are a couple of different types of database engines 47 00:02:23,120 --> 00:02:24,860 that we can use in our programs. 48 00:02:24,860 --> 00:02:29,780 We're going to talk about SQL, S-Q-L, which stands for the Structured Query 49 00:02:29,780 --> 00:02:30,921 Language. 50 00:02:30,921 --> 00:02:33,170 The Structure Query Language is a programming language 51 00:02:33,170 --> 00:02:37,160 whose sole purpose in life is to query or ask questions of or retrieve data 52 00:02:37,160 --> 00:02:39,120 from a database. 53 00:02:39,120 --> 00:02:42,361 And there are many different implementations of SQL. 54 00:02:42,361 --> 00:02:44,360 Two of the most popular are as follows-- we have 55 00:02:44,360 --> 00:02:47,510 MySQL which is an open-source platform. 56 00:02:47,510 --> 00:02:50,230 It is very commonly used to establish relational databases. 57 00:02:50,230 --> 00:02:51,260 We're not going to get into a lot of detail 58 00:02:51,260 --> 00:02:53,359 in this video about what a relational database is 59 00:02:53,359 --> 00:02:54,650 versus other types of database. 60 00:02:54,650 --> 00:02:56,649 But just know that there are more than one type, 61 00:02:56,649 --> 00:02:59,640 and we cover relational databases in this course. 62 00:02:59,640 --> 00:03:03,380 Another type, which we actually have used in CS50 since 2016, 63 00:03:03,380 --> 00:03:07,834 is SQLite, which has a very similar feature set to MySQL. 64 00:03:07,834 --> 00:03:09,500 But it's just a little more lightweight. 65 00:03:09,500 --> 00:03:12,020 It's a little easier to use on CS50 IDE. 66 00:03:12,020 --> 00:03:15,140 And so that's why we're actually going to be using that one in the class. 67 00:03:15,140 --> 00:03:20,540 Now, regardless of which implementation of SQL that you use, a lot of them 68 00:03:20,540 --> 00:03:25,280 will come with a tool called phpMyAdmin, which is a GUI or Graphical User 69 00:03:25,280 --> 00:03:28,760 Interface tool that is used to execute some 70 00:03:28,760 --> 00:03:33,410 of the more tedious or mundane database queries in a more user-m friendly way 71 00:03:33,410 --> 00:03:36,070 because you can just click and do things in the web browser. 72 00:03:36,070 --> 00:03:39,620 And those are most commonly used to build databases in the first place 73 00:03:39,620 --> 00:03:44,030 and to set up tables because that's the first thing you're 74 00:03:44,030 --> 00:03:47,522 going to do once you get your database configured, is to create a table. 75 00:03:47,522 --> 00:03:49,730 Because without a table, we have no rows and columns. 76 00:03:49,730 --> 00:03:52,430 And without rows and columns, we can't store any data. 77 00:03:52,430 --> 00:03:55,340 Tables have very cumbersome syntax that is used to set them up. 78 00:03:55,340 --> 00:04:00,200 And so using phpMyAdmin, the graphical interface, to build your table 79 00:04:00,200 --> 00:04:02,420 is definitely gonna come in handy. 80 00:04:02,420 --> 00:04:04,910 In the process of building your table, you're 81 00:04:04,910 --> 00:04:08,149 going to have to specify exactly which columns 82 00:04:08,149 --> 00:04:09,690 are going to be stored in that table. 83 00:04:09,690 --> 00:04:12,273 So at the very beginning, when you create your table, you say, 84 00:04:12,273 --> 00:04:14,850 my table's going to store usernames and passwords 85 00:04:14,850 --> 00:04:16,850 and whatever sort of other information you want. 86 00:04:16,850 --> 00:04:20,150 and you have to specify that before you have inserted any data into the table. 87 00:04:20,150 --> 00:04:24,410 So you have to design ahead of time what your table is going to look like. 88 00:04:24,410 --> 00:04:27,650 Once you've done that, pretty much every query except for, like, 89 00:04:27,650 --> 00:04:29,744 deleting the database and deleting the table 90 00:04:29,744 --> 00:04:31,910 that you're going to use on that table going forward 91 00:04:31,910 --> 00:04:33,620 is going to refer to data which is stored 92 00:04:33,620 --> 00:04:37,700 in the different rows of the table. 93 00:04:37,700 --> 00:04:40,310 Just like in C, every column of our SQL table 94 00:04:40,310 --> 00:04:43,420 is capable of holding data of different data types. 95 00:04:43,420 --> 00:04:48,080 So in C, for example, we had characters and strings and integers and floats. 96 00:04:48,080 --> 00:04:51,110 And SQL has a few more data types than that. 97 00:04:51,110 --> 00:04:54,030 This is just 20 of them, and this is not even an exhaustive list. 98 00:04:54,030 --> 00:04:55,610 But some of these things should look familiar. 99 00:04:55,610 --> 00:04:57,901 So, for example, we have int, which can store integers. 100 00:04:57,901 --> 00:05:00,830 But we also have these four other types, which can hold integers 101 00:05:00,830 --> 00:05:02,360 with different upper bounds. 102 00:05:02,360 --> 00:05:05,200 So you may recall from C that the upper bound of integer 103 00:05:05,200 --> 00:05:10,220 is 2 to the 31st power, or 2 to the 32nd power if they're unsigned integers. 104 00:05:10,220 --> 00:05:12,620 But here we can say small ints or tiny ints 105 00:05:12,620 --> 00:05:14,990 or medium or big ints, each of which have 106 00:05:14,990 --> 00:05:17,690 different upper bounds on the values. 107 00:05:17,690 --> 00:05:21,500 Decimal and float stand in place of double and float, 108 00:05:21,500 --> 00:05:25,850 which we're familiar with from C. We can also store date and time stamps 109 00:05:25,850 --> 00:05:27,590 in SQL databases. 110 00:05:27,590 --> 00:05:29,860 There's no data type for that that's native to C. 111 00:05:29,860 --> 00:05:32,760 But in SQL, there are a couple of different ways to do this. 112 00:05:32,760 --> 00:05:35,940 We can even do more exotic things like store geometry or line strings. 113 00:05:35,940 --> 00:05:36,830 And what are these? 114 00:05:36,830 --> 00:05:41,330 Well, geometry and line strings can be used to store in a SQL database 115 00:05:41,330 --> 00:05:45,820 a mapping out or a drawing out of an area on a map, such as using GIS data. 116 00:05:45,820 --> 00:05:48,440 And we can actually store that in our SQL table 117 00:05:48,440 --> 00:05:52,310 and recreate that exact drawing a little bit later on. 118 00:05:52,310 --> 00:05:55,160 Text sort of stands in the stead of strings 119 00:05:55,160 --> 00:06:00,200 for just arbitrarily large chunks of text. 120 00:06:00,200 --> 00:06:04,170 Enums are also a type that exists in C, but we don't talk about them too much 121 00:06:04,170 --> 00:06:08,650 in C. But really quickly, what an enum is is it is a column of your table 122 00:06:08,650 --> 00:06:11,645 that can be used to store a limited set of values. 123 00:06:11,645 --> 00:06:14,930 So, for example, I could have an enum that is called, 124 00:06:14,930 --> 00:06:17,034 like, favorite colors, for example. 125 00:06:17,034 --> 00:06:18,950 And I could specify when I'm building my table 126 00:06:18,950 --> 00:06:22,670 that it can only be capable of holding red, green, and blue. 127 00:06:22,670 --> 00:06:25,880 If you tried to insert a row that had purple in that place, 128 00:06:25,880 --> 00:06:29,510 that wouldn't work because it is not one of the enumerated values that 129 00:06:29,510 --> 00:06:31,520 can be stored in that column. 130 00:06:31,520 --> 00:06:34,310 There are also char and varchar. 131 00:06:34,310 --> 00:06:38,057 And these are not quite the same as what you might think they are in C. 132 00:06:38,057 --> 00:06:39,890 So let's just take a quick second to explain 133 00:06:39,890 --> 00:06:43,520 the difference between these two data types, which are pretty important. 134 00:06:43,520 --> 00:06:47,600 So unlike in C, char does not refer to a single character. 135 00:06:47,600 --> 00:06:51,180 It is actually sort of akin to our notion of a string, 136 00:06:51,180 --> 00:06:54,620 but with a caveat that that string is a fixed length. 137 00:06:54,620 --> 00:06:57,500 And typically when we specify a char or a varchar type, 138 00:06:57,500 --> 00:07:00,680 we have to specify the length of that string at the outset, 139 00:07:00,680 --> 00:07:02,810 just like we do in c. 140 00:07:02,810 --> 00:07:07,460 So, for example, my column type might be a char 10. 141 00:07:07,460 --> 00:07:10,940 That means that I can store 10 character strings 142 00:07:10,940 --> 00:07:15,005 in that column of my table, exactly 10 character strings. 143 00:07:15,005 --> 00:07:16,880 So if I tried to store, for example, the word 144 00:07:16,880 --> 00:07:21,420 "hi," H-I, which is just two letters, that would go into the column OK. 145 00:07:21,420 --> 00:07:24,020 But it would also store 8 extra-- 146 00:07:24,020 --> 00:07:28,130 the equivalent of null bytes, basically, so that I still had 10 characters, 147 00:07:28,130 --> 00:07:34,145 per se, in that column for that row. 148 00:07:34,145 --> 00:07:37,089 And if I tried to store, like, a 15-character string, 149 00:07:37,089 --> 00:07:37,880 that wouldn't work. 150 00:07:37,880 --> 00:07:40,100 I'd only end up storing the first 10 letters. 151 00:07:40,100 --> 00:07:43,430 So it's always going to be 10, every time. 152 00:07:43,430 --> 00:07:46,490 A varchar, on the other hand, refers to a variable-length string. 153 00:07:46,490 --> 00:07:49,490 So if I said that my column was varchar 99, 154 00:07:49,490 --> 00:07:54,740 I can store 1, 2, 3, 4, 5 character strings, up to 99 characters 155 00:07:54,740 --> 00:07:59,090 long, without having to have all of this extra sort of slack space 156 00:07:59,090 --> 00:08:02,360 or null bytes or zeros tacked onto the end. 157 00:08:02,360 --> 00:08:05,100 So char, fixed-length strings. 158 00:08:05,100 --> 00:08:06,557 Varchar, variable-length strings. 159 00:08:06,557 --> 00:08:08,390 We won't get into the difference between why 160 00:08:08,390 --> 00:08:09,440 you might want to use one or the other. 161 00:08:09,440 --> 00:08:12,315 But there are reasons why you might not always want to use a varchar, 162 00:08:12,315 --> 00:08:15,790 and you might want to use a char in some situations. 163 00:08:15,790 --> 00:08:19,840 Now, SQLite actually has many or all of these same data types. 164 00:08:19,840 --> 00:08:22,480 But the difference there is that each of those data types 165 00:08:22,480 --> 00:08:26,220 is affiliated with what's called a type affinity to simplify things. 166 00:08:26,220 --> 00:08:28,150 So that whole list of 20-- 167 00:08:28,150 --> 00:08:32,110 or more, really-- can be reduced to one of these five different affinities-- 168 00:08:32,110 --> 00:08:34,660 null, integer, real, text, and blob. 169 00:08:34,660 --> 00:08:36,700 Null is probably pretty obvious-- nothing. 170 00:08:36,700 --> 00:08:37,919 Integer-- whole numbers. 171 00:08:37,919 --> 00:08:40,929 Real-- that would include things like decimal and float. 172 00:08:40,929 --> 00:08:43,780 Text would include things like char and varchar. 173 00:08:43,780 --> 00:08:48,010 Blob is just-- that would be more like the geometry or data that isn't really 174 00:08:48,010 --> 00:08:51,070 text, but it's just a large number of bits or bytes. 175 00:08:51,070 --> 00:08:52,060 That would be a blob. 176 00:08:52,060 --> 00:08:55,420 And everything can reduce to one of these five different type affinities. 177 00:08:55,420 --> 00:08:58,720 So after specifying the columns, it's also really important in our SQL table 178 00:08:58,720 --> 00:09:01,480 to have one other consideration, which is to have 179 00:09:01,480 --> 00:09:04,150 one column which is our primary key. 180 00:09:04,150 --> 00:09:06,550 Why do we need a primary key? 181 00:09:06,550 --> 00:09:09,220 The reason is every row of our table, in order 182 00:09:09,220 --> 00:09:13,810 to make our SQL queries most effective, should be able to be uniquely 183 00:09:13,810 --> 00:09:15,520 and quickly identified. 184 00:09:15,520 --> 00:09:17,890 So choosing the right primary key will allow 185 00:09:17,890 --> 00:09:23,980 us to make sure that there is one value in every row that is completely unique. 186 00:09:23,980 --> 00:09:27,550 And if that is true, if there is one column in every row that's unique, 187 00:09:27,550 --> 00:09:30,130 then we can uniquely identify or very quickly identify 188 00:09:30,130 --> 00:09:32,500 which row we're talking about. 189 00:09:32,500 --> 00:09:35,090 Now, it's also possible to establish a joint primary key, 190 00:09:35,090 --> 00:09:37,241 which is just a combination of, say, two columns-- 191 00:09:37,241 --> 00:09:39,490 but it can be an arbitrarily large number of columns-- 192 00:09:39,490 --> 00:09:41,114 that is always guaranteed to be unique. 193 00:09:41,114 --> 00:09:44,200 So I could have one column that could always have A's or B's or C's. 194 00:09:44,200 --> 00:09:47,200 I could have another column that has ones and twos and threes and fours. 195 00:09:47,200 --> 00:09:50,440 I could have multiple A's and multiple B's and multiple C's, multiple ones, 196 00:09:50,440 --> 00:09:51,940 twos, and threes. 197 00:09:51,940 --> 00:09:54,080 But across all of those rows, I'm only ever allowed 198 00:09:54,080 --> 00:09:57,160 to have 1 combination of A1. 199 00:09:57,160 --> 00:09:58,630 I can have B1, C1. 200 00:09:58,630 --> 00:10:00,370 I can have A2, A3, A4. 201 00:10:00,370 --> 00:10:04,450 But there's only ever one combination of those two columns that is unique. 202 00:10:04,450 --> 00:10:08,457 That would also be acceptable because that is a joint primary key. 203 00:10:08,457 --> 00:10:11,290 Now, SQL is a programming language, like other programming languages 204 00:10:11,290 --> 00:10:12,190 that we've discussed. 205 00:10:12,190 --> 00:10:14,557 But it has a very limited vocabulary. 206 00:10:14,557 --> 00:10:17,140 Now, there are more things that you can do with SQL than we're 207 00:10:17,140 --> 00:10:18,820 going to talk about in this video because in this video 208 00:10:18,820 --> 00:10:21,191 we're just going to talk about the four operations 209 00:10:21,191 --> 00:10:24,190 that one can perform on a table, or the four main operations that you'll 210 00:10:24,190 --> 00:10:26,260 most likely perform on the table-- 211 00:10:26,260 --> 00:10:29,161 INSERT, SELECT, UPDATE, and DELETE. 212 00:10:29,161 --> 00:10:32,410 And these four things we'll get into a bit more detail in just a second about. 213 00:10:32,410 --> 00:10:34,630 But these are definitely the most common four things 214 00:10:34,630 --> 00:10:36,370 you'll be doing with the tables that you're building, 215 00:10:36,370 --> 00:10:39,460 certainly in CS50, but also probably more generally whenever you're 216 00:10:39,460 --> 00:10:41,269 working with databases. 217 00:10:41,269 --> 00:10:44,060 So for all of the examples that we're going to cover in this video, 218 00:10:44,060 --> 00:10:46,450 we're going to consider a database that contains 219 00:10:46,450 --> 00:10:50,530 these two tables called users and moms. 220 00:10:50,530 --> 00:10:54,370 And you can see users has four different columns-- idnum, username, password, 221 00:10:54,370 --> 00:10:55,720 and fullname. 222 00:10:55,720 --> 00:10:59,684 And moms has two different columns, username and mother. 223 00:10:59,684 --> 00:11:01,600 And let's now start to work with these and see 224 00:11:01,600 --> 00:11:05,860 how we can manipulate this database and the tables within it to-- 225 00:11:05,860 --> 00:11:09,190 for whatever reason we need them to-- be updated in our website. 226 00:11:09,190 --> 00:11:11,320 So the first operation we'll cover is insert. 227 00:11:11,320 --> 00:11:14,230 And as you might expect, it adds information to a table. 228 00:11:14,230 --> 00:11:17,604 Now, SQL queries have a certain structure to them. 229 00:11:17,604 --> 00:11:20,020 And so with each of these operations, what I'm going to do 230 00:11:20,020 --> 00:11:24,680 is give you the general sort of skeleton of what a query might look like. 231 00:11:24,680 --> 00:11:27,070 And then we'll use a couple of examples of that query 232 00:11:27,070 --> 00:11:29,350 to see how it affects our table. 233 00:11:29,350 --> 00:11:33,190 So an insert query, in general, looks something like this-- 234 00:11:33,190 --> 00:11:35,110 INSERT INTO table. 235 00:11:35,110 --> 00:11:37,690 So we specify what table we want to insert into. 236 00:11:37,690 --> 00:11:39,610 And then we have parentheses, columns-- which 237 00:11:39,610 --> 00:11:43,180 is a comma-separated list of all of the columns of our table 238 00:11:43,180 --> 00:11:45,730 that we want to insert data into. 239 00:11:45,730 --> 00:11:49,510 Then we have VALUES, and then a comma-separated list of the values 240 00:11:49,510 --> 00:11:53,150 that we want to put into those columns in the same order. 241 00:11:53,150 --> 00:11:59,110 So for example, I might want to INSERT INTO users into these three columns-- 242 00:11:59,110 --> 00:12:01,540 username, password, name. 243 00:12:01,540 --> 00:12:06,417 The values, respectively-- newman, in lowercase, USMAIL-- 244 00:12:06,417 --> 00:12:08,500 that's going to be corresponding to the password-- 245 00:12:08,500 --> 00:12:11,650 and then capital-N Newman for fullname. 246 00:12:11,650 --> 00:12:17,140 So assuming that I execute this query on the users table shown here, 247 00:12:17,140 --> 00:12:18,279 what's going to happen? 248 00:12:18,279 --> 00:12:19,820 Well, this is what's going to happen. 249 00:12:19,820 --> 00:12:24,350 We're going to add this row because we inserted into the users table. 250 00:12:24,350 --> 00:12:26,350 But you might be asking yourself, wait a minute. 251 00:12:26,350 --> 00:12:29,150 You never specified ID number. 252 00:12:29,150 --> 00:12:31,030 So how exactly did ID number get there? 253 00:12:31,030 --> 00:12:34,740 I mean, you specified newman and USMAIL and Newman. 254 00:12:34,740 --> 00:12:37,120 But that 12, that was never part of our query. 255 00:12:37,120 --> 00:12:38,064 And you're right. 256 00:12:38,064 --> 00:12:40,480 Because when you define the column that ultimately ends up 257 00:12:40,480 --> 00:12:43,414 being your table's primary key, it's usually a good idea 258 00:12:43,414 --> 00:12:44,830 to have that column be an integer. 259 00:12:44,830 --> 00:12:49,150 It's not a requirement, but it's usually a good idea. 260 00:12:49,150 --> 00:12:53,110 And because having a primary key is so important, 261 00:12:53,110 --> 00:12:55,600 you might want to specify it to autoincrement. 262 00:12:55,600 --> 00:12:58,450 And what this basically means is you can forget 263 00:12:58,450 --> 00:13:01,030 to include it as part of your query. 264 00:13:01,030 --> 00:13:05,080 And if you do, it will automatically insert a value there 265 00:13:05,080 --> 00:13:07,930 that is unique from every other value in that column, 266 00:13:07,930 --> 00:13:10,737 typically by just incrementing by one every time, 267 00:13:10,737 --> 00:13:12,820 so that that row is still guaranteed to be unique. 268 00:13:12,820 --> 00:13:15,170 Because if it wasn't set to autoincrement 269 00:13:15,170 --> 00:13:18,690 and we just forgot to omit it, there might 270 00:13:18,690 --> 00:13:22,830 be a couple of rows that are actually blank or null there. 271 00:13:22,830 --> 00:13:25,740 But if that's our primary key, if ID number is our primary key, 272 00:13:25,740 --> 00:13:30,040 and we have a couple of rows that are blank there, now they're not unique. 273 00:13:30,040 --> 00:13:31,707 We have two rows that are blank. 274 00:13:31,707 --> 00:13:33,040 We can't uniquely identify them. 275 00:13:33,040 --> 00:13:36,359 And so when you set up your primary key, you usually want it to be an integer, 276 00:13:36,359 --> 00:13:39,150 and you usually want it to autoincrement so that you can completely 277 00:13:39,150 --> 00:13:42,740 forget about having to worry about that and let the table do it for you. 278 00:13:42,740 --> 00:13:44,100 So that's how 12 got there. 279 00:13:44,100 --> 00:13:48,540 I had specified my idnum column do autoincrement every time I 280 00:13:48,540 --> 00:13:52,010 make an insert into the table. 281 00:13:52,010 --> 00:13:53,230 Let's do this another time. 282 00:13:53,230 --> 00:13:55,813 Let's insert into the other table, insert into the moms table, 283 00:13:55,813 --> 00:13:58,300 the following values into these following columns. 284 00:13:58,300 --> 00:14:01,980 We can INSERT INTO the username and mother columns of the users table 285 00:14:01,980 --> 00:14:05,069 the VALUES kramer and Babs Kramer. 286 00:14:05,069 --> 00:14:08,110 And just like what we saw before, this is what we would have as a result. 287 00:14:08,110 --> 00:14:10,068 We would just insert that row, and it would now 288 00:14:10,068 --> 00:14:13,449 be part of our database and part of our tables. 289 00:14:13,449 --> 00:14:14,740 So that's the insert operation. 290 00:14:14,740 --> 00:14:17,260 Now let's talk about the second operation, which is select. 291 00:14:17,260 --> 00:14:20,200 So if insert is used to put information into a table, 292 00:14:20,200 --> 00:14:23,110 select is used to get that information back out of the table 293 00:14:23,110 --> 00:14:24,850 so that we can do something with it. 294 00:14:24,850 --> 00:14:28,630 Select queries also have a very similar skeleton. 295 00:14:28,630 --> 00:14:29,770 And they look like this-- 296 00:14:29,770 --> 00:14:32,110 SELECT and then whatever columns you want 297 00:14:32,110 --> 00:14:35,766 to choose FROM whichever table you want to choose them from. 298 00:14:35,766 --> 00:14:38,140 And then optionally in red here, these two other things-- 299 00:14:38,140 --> 00:14:40,550 WHERE some condition is satisfied. 300 00:14:40,550 --> 00:14:43,900 And we usually, in the context of SQL, refer to a condition as a predicate. 301 00:14:43,900 --> 00:14:47,740 But basically, we're just checking to see that some situation is true. 302 00:14:47,740 --> 00:14:51,160 And we might want to, for example, order them by a specific column so 303 00:14:51,160 --> 00:14:54,940 that they are organized alphabetically by the values in some column 304 00:14:54,940 --> 00:14:55,750 or whatever else. 305 00:14:55,750 --> 00:14:58,770 We don't have to do either WHERE or ORDER BY. 306 00:14:58,770 --> 00:15:00,520 But generally WHERE in particular is going 307 00:15:00,520 --> 00:15:03,700 to be used so that you don't get your entire database back. 308 00:15:03,700 --> 00:15:06,580 And ORDER BY just helps to keep things a little more organized. 309 00:15:06,580 --> 00:15:09,670 So, for example, here's a select query that we could use-- 310 00:15:09,670 --> 00:15:13,712 SELECT idnum, fullname from the users table. 311 00:15:13,712 --> 00:15:15,420 So let's see what's going to happen here. 312 00:15:15,420 --> 00:15:19,690 So remember, I'm selecting idnum and fullname from the users table. 313 00:15:19,690 --> 00:15:22,420 Well, what is the select query going to return to me? 314 00:15:22,420 --> 00:15:23,647 This information here. 315 00:15:23,647 --> 00:15:25,480 It's going to look at each row of the table, 316 00:15:25,480 --> 00:15:28,960 and it's going to just pull out the ID number and the full name 317 00:15:28,960 --> 00:15:32,050 and give me all of those down the entire users table. 318 00:15:32,050 --> 00:15:34,100 My user table only consists of three rows, 319 00:15:34,100 --> 00:15:36,370 so it's given me three pairs of information, 320 00:15:36,370 --> 00:15:39,930 three idnum-fullname combinations. 321 00:15:39,930 --> 00:15:41,680 Now let's say I want to restrict my search 322 00:15:41,680 --> 00:15:43,180 a little bit, my query a little bit. 323 00:15:43,180 --> 00:15:48,200 SELECT password FROM users WHERE idnum is less than 12. 324 00:15:48,200 --> 00:15:51,704 So now I'm adding a predicate or a condition to my select query. 325 00:15:51,704 --> 00:15:52,870 What's going to happen here? 326 00:15:52,870 --> 00:15:55,480 Well, I'm going to get this information. 327 00:15:55,480 --> 00:15:58,990 This is just the password column from the users table 328 00:15:58,990 --> 00:16:02,360 where the ID number in that row is less than 12. 329 00:16:02,360 --> 00:16:05,830 So only rows 10 and 11's password get given back to me. 330 00:16:05,830 --> 00:16:09,610 12's does not because 12 is not less than 12. 331 00:16:09,610 --> 00:16:13,540 There's one other thing we can do with a select query, and that's to select *. 332 00:16:13,540 --> 00:16:17,750 SELECT * from moms where username equals jerry. 333 00:16:17,750 --> 00:16:20,800 * is just shorthand for every column. 334 00:16:20,800 --> 00:16:24,040 So instead of having to specify username, mother, the two columns here, 335 00:16:24,040 --> 00:16:26,740 or, if I was using the users table, having to specify idnum, 336 00:16:26,740 --> 00:16:29,650 username, password, fullname, I can just say SELECT *. 337 00:16:29,650 --> 00:16:31,680 Give me everything in that row. 338 00:16:31,680 --> 00:16:35,170 And if I SELECT * from moms where username equals jerry, 339 00:16:35,170 --> 00:16:39,096 I am given this set of information back. 340 00:16:39,096 --> 00:16:41,550 So here's the thing that's cool about databases, 341 00:16:41,550 --> 00:16:44,249 though-- we don't have to just have two tables. 342 00:16:44,249 --> 00:16:46,290 And in particular, we don't have to have just one 343 00:16:46,290 --> 00:16:50,130 table that stores every relevant piece of information about a user. 344 00:16:50,130 --> 00:16:53,050 If we wanted to store, in addition to what we have there, information 345 00:16:53,050 --> 00:16:57,754 like their address and their date of birth and their social security number 346 00:16:57,754 --> 00:16:59,670 or whatever else we wanted to keep about them, 347 00:16:59,670 --> 00:17:02,169 this table could get bigger and bigger and bigger and bigger 348 00:17:02,169 --> 00:17:05,760 to the point where it becomes almost annoying to have to use it. 349 00:17:05,760 --> 00:17:08,930 But we can use relationships between different tables. 350 00:17:08,930 --> 00:17:10,680 And that's where the term relational comes 351 00:17:10,680 --> 00:17:12,599 into play for a relational database. 352 00:17:12,599 --> 00:17:15,089 We can set up our tables within our databases 353 00:17:15,089 --> 00:17:18,900 carefully enough that we can use relationships between them 354 00:17:18,900 --> 00:17:22,319 to pull information from where we need it without all that information having 355 00:17:22,319 --> 00:17:25,270 to be located in the same table in the first place. 356 00:17:25,270 --> 00:17:30,480 So, for example, here is our database as we last left it, with three rows 357 00:17:30,480 --> 00:17:35,231 in each of the users and the moms tables. 358 00:17:35,231 --> 00:17:36,980 Now let's imagine a hypothetical situation 359 00:17:36,980 --> 00:17:41,720 where we want to pair a user's full name, which is currently only stored 360 00:17:41,720 --> 00:17:45,080 in the users table, with their mother's name, which 361 00:17:45,080 --> 00:17:47,474 is only stored in the mother table. 362 00:17:47,474 --> 00:17:49,140 They're not on the same table right now. 363 00:17:49,140 --> 00:17:52,190 So we can't use the same sort of select query that we used before. 364 00:17:52,190 --> 00:17:56,000 We have to use what's called a select join query. 365 00:17:56,000 --> 00:17:58,700 It looks pretty similar to the select query before. 366 00:17:58,700 --> 00:18:01,250 But we're adding one extra piece of information, which 367 00:18:01,250 --> 00:18:03,360 is what tables we're joining together. 368 00:18:03,360 --> 00:18:07,610 So I want to select a specific set of columns from one table, 369 00:18:07,610 --> 00:18:10,302 joining another table onto it just temporarily. 370 00:18:10,302 --> 00:18:11,510 It doesn't really merge them. 371 00:18:11,510 --> 00:18:16,700 But it creates this sort of hypothetical table that does merge them. 372 00:18:16,700 --> 00:18:18,679 On predicate, where predicate is-- basically, 373 00:18:18,679 --> 00:18:20,720 we're trying to find where the two tables overlap 374 00:18:20,720 --> 00:18:23,930 so that we can create this merged table just temporarily 375 00:18:23,930 --> 00:18:27,140 for purposes of this query, and then have them separate again. 376 00:18:27,140 --> 00:18:30,320 So, for example, I might have a query that looks like this-- 377 00:18:30,320 --> 00:18:36,620 SELECT users.fullname, comma, moms.mother FROM users joining the moms 378 00:18:36,620 --> 00:18:41,090 table onto it temporarily ON-- 379 00:18:41,090 --> 00:18:44,220 which is akin to WHERE-- 380 00:18:44,220 --> 00:18:48,055 users.username equals moms.username. 381 00:18:48,055 --> 00:18:50,180 And this syntax is also a little unfamiliar, right? 382 00:18:50,180 --> 00:18:53,820 I'm now prepending some of my column names with table names. 383 00:18:53,820 --> 00:18:56,570 The reason I'm doing this is particularly because of the last line 384 00:18:56,570 --> 00:19:03,050 there, where in each of those two tables I had a column called username. 385 00:19:03,050 --> 00:19:05,180 I need to disambiguate between them. 386 00:19:05,180 --> 00:19:07,640 So I need to specify what table and what column. 387 00:19:07,640 --> 00:19:10,931 That's all that's happening here, is I'm being very explicit about which column 388 00:19:10,931 --> 00:19:15,240 I want by specifying not only the column name, but also the table name. 389 00:19:15,240 --> 00:19:19,400 So here is what would happen if I executed this select query. 390 00:19:19,400 --> 00:19:21,980 Here are our two tables at the outset. 391 00:19:21,980 --> 00:19:25,520 This is all of the information that I'm trying to get to start with. 392 00:19:25,520 --> 00:19:31,970 I'm going to join these two tables together where where users.username 393 00:19:31,970 --> 00:19:34,130 equals moms.username. 394 00:19:34,130 --> 00:19:35,730 Newman does not appear in both tables. 395 00:19:35,730 --> 00:19:38,600 So I'm not going to even come close to extracting that piece of information. 396 00:19:38,600 --> 00:19:41,420 Kramer does not appear in both tables, so that one's out too. 397 00:19:41,420 --> 00:19:45,050 But jerry and gcostanza both appear in both tables. 398 00:19:45,050 --> 00:19:47,990 And so that is sort of the overlap point. 399 00:19:47,990 --> 00:19:50,810 And you can think about this as if we have 400 00:19:50,810 --> 00:19:54,600 our users table, which is like one circle over here, 401 00:19:54,600 --> 00:19:56,699 and our moms table, which is one circle over here. 402 00:19:56,699 --> 00:19:59,240 When we're joining, it's sort of like if it's a Venn diagram. 403 00:19:59,240 --> 00:20:01,864 We're trying to figure out what those two tables have in common 404 00:20:01,864 --> 00:20:04,805 and using that as, like, our anchor point for the join tables. 405 00:20:04,805 --> 00:20:05,930 So that's what we did here. 406 00:20:05,930 --> 00:20:08,660 We found where the two tables have data in common. 407 00:20:08,660 --> 00:20:12,200 And based on that, we create this hypothetical table 408 00:20:12,200 --> 00:20:15,800 called users & moms that contains all of the information that 409 00:20:15,800 --> 00:20:18,740 appears in both of those tables. 410 00:20:18,740 --> 00:20:22,490 But notice that the second column there sort of indicates the overlap. 411 00:20:22,490 --> 00:20:27,489 This column refers to data that is in both users.username and moms.username. 412 00:20:27,489 --> 00:20:29,030 So that's what we're extracting here. 413 00:20:29,030 --> 00:20:31,363 But remember that this query, we didn't want everything. 414 00:20:31,363 --> 00:20:35,947 It wasn't SELECT * FROM users JOIN moms WHERE users.username equals 415 00:20:35,947 --> 00:20:36,530 moms.username. 416 00:20:36,530 --> 00:20:40,650 Username We just want the full name and the mother's name. 417 00:20:40,650 --> 00:20:42,200 And so that's what we extract. 418 00:20:42,200 --> 00:20:44,210 And so that really long query just results 419 00:20:44,210 --> 00:20:48,650 in us getting this set of information-- users.fullname and moms.mother 420 00:20:48,650 --> 00:20:52,920 for every row in both tables where the username happens to be the same. 421 00:20:52,920 --> 00:20:53,532 All right. 422 00:20:53,532 --> 00:20:54,990 So that was a bit more complicated. 423 00:20:54,990 --> 00:20:56,540 You probably won't do too many joins. 424 00:20:56,540 --> 00:21:00,210 Fortunately, the last two operations are a little bit easier. 425 00:21:00,210 --> 00:21:02,510 So update is the third of the four major operations. 426 00:21:02,510 --> 00:21:04,850 This is what we use to modify information in the table 427 00:21:04,850 --> 00:21:06,901 or change it or update information. 428 00:21:06,901 --> 00:21:08,150 The skeleton looks like this-- 429 00:21:08,150 --> 00:21:12,740 UPDATE table SET column equals value-- so we're updating some column to have 430 00:21:12,740 --> 00:21:14,510 a new piece of data in it-- 431 00:21:14,510 --> 00:21:15,459 WHERE predicates. 432 00:21:15,459 --> 00:21:17,750 We're trying to figure out which row we want to update, 433 00:21:17,750 --> 00:21:19,800 and then we're updating that column. 434 00:21:19,800 --> 00:21:22,550 So, for example, I want to UPDATE users SET password equals 435 00:21:22,550 --> 00:21:25,566 yadayada WHERE idnum equals 10. 436 00:21:25,566 --> 00:21:27,690 You can probably guess what's going to happen here. 437 00:21:27,690 --> 00:21:29,773 We're going to find the row whose ID number is 10, 438 00:21:29,773 --> 00:21:34,520 and we're going update the password there from "fusilli" to "yadayada." 439 00:21:34,520 --> 00:21:35,630 Pretty straightforward. 440 00:21:35,630 --> 00:21:37,970 And the fourth and final operation that we do in SQL 441 00:21:37,970 --> 00:21:41,120 most commonly is to delete, which strikes or completely removes 442 00:21:41,120 --> 00:21:42,972 information from a table. 443 00:21:42,972 --> 00:21:44,930 The basic framework looks something like this-- 444 00:21:44,930 --> 00:21:48,140 DELETE from some table WHERE predicate, or, again, 445 00:21:48,140 --> 00:21:50,160 where some condition is satisfied. 446 00:21:50,160 --> 00:21:53,360 So, for example, you might want to DELETE from our users table WHERE 447 00:21:53,360 --> 00:21:54,994 the user name equals 'newman.' 448 00:21:54,994 --> 00:21:57,410 Now, if we did that, and this was our table to start with, 449 00:21:57,410 --> 00:21:59,326 you can probably guess what's going to happen. 450 00:21:59,326 --> 00:22:02,420 After that query executes, the row whose username is newman 451 00:22:02,420 --> 00:22:06,020 is no longer in our table. 452 00:22:06,020 --> 00:22:09,170 Now, all of the operations that we just showed 453 00:22:09,170 --> 00:22:12,924 are really easy to do in the graphical interface of phpMyAdmin, 454 00:22:12,924 --> 00:22:15,090 which, again, comes with most implementations of SQL 455 00:22:15,090 --> 00:22:16,574 that you might download. 456 00:22:16,574 --> 00:22:18,740 Here's the catch, though-- phpMyAdmin, despite being 457 00:22:18,740 --> 00:22:22,640 very user-friendly to use, requires manual intervention. 458 00:22:22,640 --> 00:22:25,640 You're going to have to log in and click on some buttons 459 00:22:25,640 --> 00:22:27,740 to delete information or update it or change it. 460 00:22:27,740 --> 00:22:29,009 And we don't want to do that. 461 00:22:29,009 --> 00:22:30,050 We're a webmaster, right? 462 00:22:30,050 --> 00:22:31,970 We want things to happen automatically for us. 463 00:22:31,970 --> 00:22:35,147 We want programs that we write to make those changes for us. 464 00:22:35,147 --> 00:22:37,730 So we don't want to go into phpMyAdmin and make those changes. 465 00:22:37,730 --> 00:22:40,490 We have to write code that does it for us. 466 00:22:40,490 --> 00:22:43,430 Fortunately, SQL integrates really nicely 467 00:22:43,430 --> 00:22:46,760 with a lot of modern programming languages such as Python or PHP 468 00:22:46,760 --> 00:22:50,570 that have functions that you can use to connect to your database. 469 00:22:50,570 --> 00:22:55,040 And then the programming language has other functions 470 00:22:55,040 --> 00:22:58,220 that will query the database for you and make those changes. 471 00:22:58,220 --> 00:23:01,430 Now, we're going to leave off for now exactly how those languages do that. 472 00:23:01,430 --> 00:23:04,007 We'll save that for a video on those languages themselves. 473 00:23:04,007 --> 00:23:05,840 But know that that is something that you can 474 00:23:05,840 --> 00:23:08,840 do to avoid having to do any sort of manual intervention 475 00:23:08,840 --> 00:23:13,960 when you want to update or do any sort of operation at all on your databases. 476 00:23:13,960 --> 00:23:15,220 My name is Doug Lloyd. 477 00:23:15,220 --> 00:23:17,190 This is CS50. 478 00:23:17,190 --> 00:23:18,591 41347

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.