Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:00,550 --> 00:00:01,210
All right guys.
2
00:00:01,240 --> 00:00:07,870
So we're going to get started learning about databases and how to actually persist or user data.
3
00:00:07,870 --> 00:00:09,730
Now here's the problem.
4
00:00:09,730 --> 00:00:16,540
Previously in the last challenge when we built our blog website, you would have noticed that every single
5
00:00:16,540 --> 00:00:19,500
time that you've restarted your server
6
00:00:19,540 --> 00:00:27,190
so for example at the moment, I'm using node app.js to run my app.js file and my server is
7
00:00:27,190 --> 00:00:28,740
running on port 3000.
8
00:00:28,840 --> 00:00:38,680
But if I use Control + C to stop running and then I rerun my app now, if I head back to my blog the post
9
00:00:38,680 --> 00:00:44,000
that I added previously will now be removed as soon as I hit refresh.
10
00:00:44,210 --> 00:00:51,940
And the reason behind this is because inside app.js, we're using an array to store all of that post
11
00:00:52,000 --> 00:00:52,880
data.
12
00:00:53,110 --> 00:00:59,130
And we don't have any way of persistently storing the data that the user generates
13
00:00:59,170 --> 00:01:04,650
for example new blog posts or new comments or new users sign ups.
14
00:01:04,780 --> 00:01:09,010
Now it's not just our website that's generating data.
15
00:01:09,040 --> 00:01:16,150
If you take a look at internetlivestats.com, you can see just how much data is being generated as
16
00:01:16,150 --> 00:01:17,380
we speak.
17
00:01:17,380 --> 00:01:23,620
If you look at the total number of Google searches today not even you know of all time, you can see how
18
00:01:23,620 --> 00:01:28,970
quickly that number moves and how many emails are sent or how many blog posts were written today.
19
00:01:29,290 --> 00:01:36,370
It's an absolutely huge amount of data that's being generated all across the Internet and they all need
20
00:01:36,370 --> 00:01:38,480
to be saved somewhere.
21
00:01:38,560 --> 00:01:41,070
That's why we need to know about databases.
22
00:01:41,320 --> 00:01:46,530
Once you realize the need for databases, the next step is actually choosing one.
23
00:01:46,600 --> 00:01:52,840
And if you look on the Internet this is actually not an easy task because there are so many databases
24
00:01:52,870 --> 00:01:58,930
out there. Even if you narrow it down to the databases that work well with Node.js, you still left
25
00:01:58,930 --> 00:02:06,360
with a lot of choice. And very frequently I'll come across blog posts have a title that's something like this:
26
00:02:06,370 --> 00:02:14,450
Cassandra vs MongoDB versus CouchDB vs... and it's just crazy the amount of choice that you have.
27
00:02:14,620 --> 00:02:19,660
But there's a reason for that because depending on the type of data you're looking to store and the
28
00:02:19,660 --> 00:02:26,440
structure of your data, you might favor one of these databases over another. But no matter what choice
29
00:02:26,440 --> 00:02:27,100
you make
30
00:02:27,100 --> 00:02:33,270
the main difference between databases is whether if they are sequel based or if they are no
31
00:02:33,280 --> 00:02:35,120
SQL based. Now
32
00:02:35,230 --> 00:02:43,480
SQL or sequel stands for Structured Query Language and it's a really old school piece of technology
33
00:02:43,510 --> 00:02:47,670
that businesses have been using for many many years. Now
34
00:02:47,680 --> 00:02:48,180
NoSQL
35
00:02:48,190 --> 00:02:55,450
stands for Not Only Structured Query Language so they can actually be any format as long as it's
36
00:02:55,450 --> 00:03:00,960
not SQL although most of them follow a very similar structure.
37
00:03:00,970 --> 00:03:05,380
Those are the two big families of databases that you get to choose from.
38
00:03:05,470 --> 00:03:10,480
And we're going to talk about the differences between each and I'm going to show you how to work with
39
00:03:10,480 --> 00:03:12,600
both of these types of databases.
40
00:03:13,120 --> 00:03:19,660
If you do a quick google, you will find that there are so many different types of databases that are
41
00:03:19,660 --> 00:03:23,500
categorized as SQUL databases or noSQL.
42
00:03:23,500 --> 00:03:29,760
But the top four most popular databases at least when we're working with Node.js, for the sequel family
43
00:03:29,770 --> 00:03:34,140
that's MySQL and Postgres. And for noSQL
44
00:03:34,180 --> 00:03:41,230
it's going to be MongoDB and redis. Now some of these databases you might have already heard of but
45
00:03:41,560 --> 00:03:47,890
a lot of people are confused as to when to use which and what are the similarities and differences and
46
00:03:47,890 --> 00:03:49,720
how do you choose anyways.
47
00:03:50,930 --> 00:03:55,120
So let's talk about what are the differences between SQL and NOSQL.
48
00:03:55,130 --> 00:04:02,840
The first difference is in their structure. So let's say that you're starting a new business and
49
00:04:02,870 --> 00:04:08,270
you've decided to start selling stationery. Now because you're shipping a lot of your product to your
50
00:04:08,270 --> 00:04:09,120
customers
51
00:04:09,290 --> 00:04:12,850
you need to keep a record of their names and addresses.
52
00:04:12,980 --> 00:04:15,300
So you create a database for that.
53
00:04:15,500 --> 00:04:21,800
And if you used to excel and creating tables then you'll find that a sequel database will serve you
54
00:04:21,800 --> 00:04:23,510
quite well.
55
00:04:23,510 --> 00:04:27,310
A sequel database will group your data into tables.
56
00:04:27,380 --> 00:04:33,490
So this is how our customer data would look if we chose a sequel database such as mySQL or
57
00:04:33,500 --> 00:04:41,300
Postgres. But sometimes you might find that sequel databases can be extremely inflexible.
58
00:04:41,330 --> 00:04:48,740
For example let's say that in your shop you have me as a customer and I'm one of those annoying customers
59
00:04:48,830 --> 00:04:52,100
who don't really like having mail sent to my address.
60
00:04:52,100 --> 00:04:54,840
So I say to you, 'You know what?
61
00:04:54,890 --> 00:04:57,500
Why don't you deliver to my address
62
00:04:57,530 --> 00:05:03,890
but for all other communications I want you to just email me? Because we are after all in the 21st
63
00:05:03,890 --> 00:05:05,250
century right?
64
00:05:05,270 --> 00:05:12,500
If you had a sequel table then you have a bit of a problem because that doesn't really fit with your
65
00:05:12,500 --> 00:05:13,480
table.
66
00:05:13,490 --> 00:05:17,910
You don't have a pre-specified email column to fit that data
67
00:05:17,910 --> 00:05:22,120
and none of your existing data have an email entry.
68
00:05:22,550 --> 00:05:27,770
And what if you had a customer who didn't want to give you any information at all?
69
00:05:27,770 --> 00:05:33,290
They didn't want to give you their address or their email and they'd rather protect their privacy.
70
00:05:33,320 --> 00:05:40,700
Well then you kind of have a gap in your table and it ends up looking very very irregular and that doesn't
71
00:05:40,700 --> 00:05:44,050
sit very well with sequel databases.
72
00:05:44,190 --> 00:05:53,010
So if you chose a sequel database, what it will try to do is it will try to square off the table. So you
73
00:05:53,010 --> 00:05:55,300
can add a new column called email
74
00:05:55,530 --> 00:06:02,060
but for all the places where there's missing information it will automatically insert a null.
75
00:06:02,190 --> 00:06:08,880
And as we learned from previous programming lessons null can be very dangerous especially if you're
76
00:06:08,880 --> 00:06:14,070
not expecting something to be null or not have any data.
77
00:06:14,100 --> 00:06:19,110
For example if you decide to send out your marketing emails, a lot of those emails are going to be
78
00:06:19,110 --> 00:06:28,400
going to an email address that's called null right? But if you had a NoSQL based database such as
79
00:06:28,490 --> 00:06:32,210
MongoDB, then this wouldn't be a problem at all.
80
00:06:33,120 --> 00:06:38,870
If you were working with Mongo the same data would be represented as JSON objects.
81
00:06:39,150 --> 00:06:43,750
So you would have key value pairs such as first name, Last name, address
82
00:06:43,920 --> 00:06:50,730
and if in one of those documents you had an extra key value pair such as email then it doesn't affect
83
00:06:50,790 --> 00:06:57,630
any of the other data records. And none of these records or documents actually have to be the same shape
84
00:06:57,960 --> 00:06:59,730
or follow the same structure.
85
00:07:00,960 --> 00:07:06,930
If you're old enough to remember that ad with a PC Guy and Mac guy, it's kind of the same with SQL and
86
00:07:06,930 --> 00:07:14,910
NoSQL. NoSQL databases tend to be flashier, newer with modern syntax and modern methods
87
00:07:15,270 --> 00:07:20,460
but SQL database is old and reliable and it likes structure.
88
00:07:20,550 --> 00:07:26,550
Everybody knows that guy who they work with in the office who really likes structure and for everybody
89
00:07:26,550 --> 00:07:32,680
to follow rules and to specify a structure ahead of time and just stick to it.
90
00:07:32,970 --> 00:07:39,420
But with NoSQL and especially MongoDB you're more flexible. You're able to adapt and maneuver around
91
00:07:39,420 --> 00:07:43,070
the situation as and when things come up.
92
00:07:43,110 --> 00:07:48,480
So it's really helpful for startups where your data structure is not predefined.
93
00:07:48,510 --> 00:07:53,190
You know one day you could be selling toasters and tomorrow you could be pivoting to giving massages.
94
00:07:53,480 --> 00:07:59,950
Then a NoSQL database can remain flexible and you're not bound to a particular structure.
95
00:08:01,140 --> 00:08:09,300
But even though the NoSQL guy is flashy and he's new and he's hip, you know I'm sure you've
96
00:08:09,300 --> 00:08:12,330
met people like this and I've met a lot of NoSQL
97
00:08:12,330 --> 00:08:20,340
guys in my time. And what they're not good at are relationships and this is another big difference
98
00:08:20,400 --> 00:08:21,670
between SQL
99
00:08:21,690 --> 00:08:27,440
and NoSQL databases in a way that they implement relationships between your data.
100
00:08:27,750 --> 00:08:30,760
Although previously we were comparing between SQL
101
00:08:30,770 --> 00:08:38,929
and NoSQL databases, you also often hear people refer to them as relational and non-relational databases.
102
00:08:39,150 --> 00:08:45,810
And this is the highlight that SQL databases of really good at establishing relationships between
103
00:08:45,870 --> 00:08:50,040
your data. Coming back to our shop,
104
00:08:50,060 --> 00:08:54,990
let's say that you wanted to keep a record of all of the orders that are being made.
105
00:08:55,040 --> 00:09:00,190
So you want to know what's the name of the customer, who bought it, what's their address,
106
00:09:00,260 --> 00:09:01,490
what did they buy,
107
00:09:01,490 --> 00:09:03,270
what was the price of the product,
108
00:09:03,290 --> 00:09:05,300
how many of it did they buy,
109
00:09:05,300 --> 00:09:06,320
all of these things.
110
00:09:06,380 --> 00:09:13,520
But it doesn't really make sense to cram it all into the same table. Especially if that same customer
111
00:09:13,550 --> 00:09:16,970
comes back and buys a different product
112
00:09:17,150 --> 00:09:24,170
then you have a lot of repeated data and it doesn't make any sense to cram everything into the same table
113
00:09:24,170 --> 00:09:24,750
right?
114
00:09:25,070 --> 00:09:31,310
Using a SQL database what you would do is you would group related pieces of data into individual
115
00:09:31,310 --> 00:09:32,090
tables.
116
00:09:32,420 --> 00:09:39,650
So you might have a customer table, a products table and an orders table. And in your database you can
117
00:09:39,650 --> 00:09:43,430
specify a relationship between the tables.
118
00:09:43,430 --> 00:09:51,230
So for example you could link up the customers table through the customer ID to the orders table. And
119
00:09:51,230 --> 00:09:56,080
you could link the products table to the products ID in the orders table.
120
00:09:56,180 --> 00:10:01,960
If we had some real data for example, we have a database of all of our customers and where they live.
121
00:10:02,180 --> 00:10:06,430
We have a table of our products and what their prices are.
122
00:10:06,650 --> 00:10:12,230
And we also have this Orders table which links up the data in the different tables.
123
00:10:12,590 --> 00:10:19,670
Let's say we have an order with order ID 1, our very first order. We can have a column where we specify
124
00:10:19,670 --> 00:10:21,310
the customer ID.
125
00:10:21,500 --> 00:10:25,480
So in this case it's linked to the customer that has an ID of 2.
126
00:10:25,580 --> 00:10:28,240
So that is of course me
127
00:10:28,250 --> 00:10:28,640
right?
128
00:10:28,640 --> 00:10:35,130
cUSTOMER 2 is Angela Yu. Now the next column saves the product ID.
129
00:10:35,300 --> 00:10:44,990
So this links to the products table with the product that has an ID of 2. By using this format we can
130
00:10:44,990 --> 00:10:49,400
join the tables together at a later point if need be.
131
00:10:49,700 --> 00:10:55,910
So we can know for each order which customer bought it and which products were bought.
132
00:10:57,480 --> 00:11:01,800
Now using a NoSQL database such as MongoDB
133
00:11:01,950 --> 00:11:05,180
you have to rethink how you organize your data.
134
00:11:05,340 --> 00:11:10,450
In this case we might have a document where we have an order ID.
135
00:11:10,800 --> 00:11:19,050
We have a key value pair that's a customer with customer as the key and their details as the value
136
00:11:19,920 --> 00:11:24,190
and also the products as its own embedded objects.
137
00:11:24,540 --> 00:11:30,180
So now we have embedded objects in order to represent the entire order.
138
00:11:30,270 --> 00:11:33,340
And this might lead to some data repetition.
139
00:11:34,250 --> 00:11:38,650
So you could also organize your data in different documents
140
00:11:38,660 --> 00:11:39,230
right?
141
00:11:39,260 --> 00:11:47,180
So say you have three objects. One has details of the order, one has details of the customer and one has
142
00:11:47,180 --> 00:11:51,280
details of the product. And you can also link them together
143
00:11:51,600 --> 00:11:55,650
although a little bit more clumsily through using references.
144
00:11:55,670 --> 00:12:03,650
So for example in the order document, you can see the customer references the document with a person
145
00:12:03,650 --> 00:12:10,900
ID of person_01 and the product references a product with the ID of product_01.
146
00:12:10,910 --> 00:12:18,380
So this is how you might implement your relationships in a NoSQL database such as MongoDB. But
147
00:12:18,470 --> 00:12:24,080
it's a little bit clumsy and when you start searching this database that structured like this, it might
148
00:12:24,170 --> 00:12:29,830
not be quite as efficient or fast as a SQL database.
149
00:12:30,060 --> 00:12:37,220
So depending on the structure of your data, you will want to choose a SQL or a NoSQL database.
150
00:12:37,440 --> 00:12:45,330
And you should know very often you will see on the Internet people saying MongoDB is the best database
151
00:12:45,420 --> 00:12:46,810
out there ever.
152
00:12:47,010 --> 00:12:52,810
And other people will say don't ever use MongoDB, it's the worst. Or you know
153
00:12:52,810 --> 00:12:55,670
MySQL is the best thing since sliced bread.
154
00:12:55,710 --> 00:12:57,670
Don't believe what people say.
155
00:12:57,750 --> 00:13:03,370
Instead you have to know what your particular situation is to choose the right database.
156
00:13:03,660 --> 00:13:11,100
If you need to store data such as orders, customer details, products inventory, things that have lots of
157
00:13:11,100 --> 00:13:16,650
relationships between each other, then you might be better off choosing a SQL database such as
158
00:13:16,650 --> 00:13:17,770
MySQL.
159
00:13:17,850 --> 00:13:24,230
However if you have a website where you have something that's more of a one to many kind of relationship
160
00:13:24,240 --> 00:13:31,440
like here for example say if we were making a database for Instagram, then we might have a user name,
161
00:13:31,500 --> 00:13:36,000
a first name, last name, and then we might have a post array.
162
00:13:36,090 --> 00:13:43,230
And so this is the array of objects and those objects each have an image URL description and the
163
00:13:43,230 --> 00:13:45,280
date that they were created.
164
00:13:45,360 --> 00:13:53,100
In this case where you have a single user generating lots of content creating a one user to many posts
165
00:13:53,220 --> 00:14:01,350
relationship, then this is much easier to map out using something like MongoDB. The last difference
166
00:14:01,350 --> 00:14:07,970
I want to talk about is scalability. Despite it being relatively weak on the relationship front
167
00:14:08,010 --> 00:14:14,060
one of the reasons why you might use NoSQL over SQL is because of scalability.
168
00:14:14,370 --> 00:14:20,720
This is a real strong point of NoSQL databases and this is a reason why you might choose a NoSQL
169
00:14:20,720 --> 00:14:23,650
database over a SQL database.
170
00:14:23,820 --> 00:14:30,630
If you've ever had an Excel spreadsheet that has lots and lots of rows, so say 10,000+ rows
171
00:14:31,080 --> 00:14:33,050
then you might have experienced this.
172
00:14:33,120 --> 00:14:35,770
It gets slower and slower
173
00:14:35,790 --> 00:14:42,630
the more rows of data you add until a point where the file is so big that your computer can no longer
174
00:14:42,630 --> 00:14:49,170
handle it and you might get a pop up saying Excel cannot complete this task with available resources.
175
00:14:49,170 --> 00:14:52,240
Choose less data or close other applications.
176
00:14:52,320 --> 00:14:59,430
It's saying that you need a more powerful computer. If you are running your SQL database on your laptop
177
00:14:59,430 --> 00:15:00,400
for example
178
00:15:00,480 --> 00:15:06,900
then you might just get a more powerful computer, say a desktop computer with higher specs
179
00:15:06,960 --> 00:15:07,390
right?
180
00:15:07,530 --> 00:15:13,830
And once that start slowing down and struggling, then you might just go for a mainframe and buy the biggest
181
00:15:13,830 --> 00:15:15,930
computer that you can find.
182
00:15:15,960 --> 00:15:22,020
But as we saw previously there are some companies that have so much data.
183
00:15:22,200 --> 00:15:29,430
For example the number of websites that Google has to index and be able to serve up in searches is
184
00:15:29,430 --> 00:15:37,470
in the bazillions and they have so much data that even the fastest computer will slow down and struggle.
185
00:15:38,220 --> 00:15:42,770
Managing large amounts of data using SQL based databases
186
00:15:42,840 --> 00:15:47,210
it's kind of like scaling a building vertically. As soon as you have more data
187
00:15:47,220 --> 00:15:53,820
you just add floors onto your building until eventually you end up with the skyscraper of a building.
188
00:15:53,880 --> 00:15:58,020
It might just become unstable and end up collapsing on you.
189
00:15:58,290 --> 00:16:05,730
And it's also very costly to scale vertically like this. Now because the way that MongoDB organizes
190
00:16:05,730 --> 00:16:13,650
its data into smaller chunks or smaller documents of data where essentially each record or each row
191
00:16:13,650 --> 00:16:17,510
in the table is represented as a JSON object,
192
00:16:17,700 --> 00:16:25,280
then in this case instead of buying larger and larger computers, it allows for a distributed system.
193
00:16:25,410 --> 00:16:30,350
So your database can be distributed amongst lots and lots of different computers.
194
00:16:30,460 --> 00:16:37,230
And this is equivalent to building low but sturdy buildings and you scale horizontally so you build
195
00:16:37,230 --> 00:16:43,160
lots and lots of those. And all the data is distributed and spread out amongst all of them.
196
00:16:44,210 --> 00:16:50,810
So if we were to compare the most popular SQL database which is MySQL against the most popular
197
00:16:51,230 --> 00:16:58,790
NoSQL database which is MongoDB, then these are some of the pros and cons of each. SQL databases
198
00:16:59,000 --> 00:17:02,040
have been around for ages.
199
00:17:02,210 --> 00:17:05,300
That means that the technology is more mature.
200
00:17:05,569 --> 00:17:11,390
But also if you come across anything that's SQL related on the internet, it all looks like it was created
201
00:17:11,420 --> 00:17:13,109
in the 90s.
202
00:17:13,490 --> 00:17:21,260
MongoDB is shiny and new but as we're finding out some of the things are not really ironed out. That
203
00:17:21,260 --> 00:17:27,460
might mean more frequent changes in the technology as they find new problems and address them.
204
00:17:27,650 --> 00:17:34,430
The next difference is that SQL databases run on a table structure as we've seen before whereas Mongo
205
00:17:34,430 --> 00:17:37,290
DB works with a document structure.
206
00:17:37,490 --> 00:17:43,190
So this way individual rows of data can be each represented as a document.
207
00:17:43,310 --> 00:17:47,180
And this leads us onto the next point. With a SQL database
208
00:17:47,180 --> 00:17:51,770
you should have a structure in mind before you create your database.
209
00:17:51,920 --> 00:17:55,550
And this, in database lingo, is called a schema.
210
00:17:55,580 --> 00:18:01,450
So these are the names of your columns and the names of your tables and how many columns you might need.
211
00:18:02,120 --> 00:18:04,900
And SQL databases work really well
212
00:18:04,940 --> 00:18:10,940
if you have a fixed schema, so if you already know ahead of time what your customer database will look
213
00:18:10,940 --> 00:18:13,230
like, what your orders database will look like
214
00:18:13,400 --> 00:18:15,910
and you know that they're not likely to change.
215
00:18:16,250 --> 00:18:22,040
Whereas with something like MongoDB and other NoSQL databases, they're more flexible to changes.
216
00:18:22,220 --> 00:18:27,170
So if you're a young startup and you haven't really figured out how you're going to organize your data
217
00:18:27,470 --> 00:18:32,960
or what kind of data you're going to store, then it might be a better idea to choose this particular
218
00:18:32,960 --> 00:18:34,220
type of database.
219
00:18:34,580 --> 00:18:40,640
But as we saw earlier on, if you need to store data with a lot of relationships between the data then
220
00:18:40,760 --> 00:18:46,320
a SQL database will work out to be much faster when you query through your database.
221
00:18:46,670 --> 00:18:53,240
Whereas with something like MongoDB it can manage simple relationships but it's really not great when
222
00:18:53,240 --> 00:19:00,080
those relationships start to get complex and when you have data in MongoDB that's linked up with complex
223
00:19:00,080 --> 00:19:01,090
relationships
224
00:19:01,250 --> 00:19:08,470
The speed and efficiency can quickly go down. The last point as we mentioned earlier on is that SQL
225
00:19:08,470 --> 00:19:13,300
databases tend to scale vertically. So you tend to need more infrastructure,
226
00:19:13,510 --> 00:19:14,930
it's more costly.
227
00:19:15,100 --> 00:19:21,160
Whereas with MongoDB and other NoSQL databases they can scale horizontally and allows for a
228
00:19:21,160 --> 00:19:28,430
more distributed structure which makes it much easier to scale.
229
00:19:28,440 --> 00:19:34,320
So this is your introduction for working with databases and especially looking at the differences between
230
00:19:34,320 --> 00:19:36,410
the two major families, SQL
231
00:19:36,460 --> 00:19:42,430
and NoSQL databases. Now in the coming lessons we're going to explore in a more hands on way
232
00:19:42,660 --> 00:19:48,930
the differences between SQL and NoSQL so that you can try it out and see for yourself how they
233
00:19:48,930 --> 00:19:54,640
each work and experience some of the frustrations as well as the benefits of each.
234
00:19:54,810 --> 00:19:57,420
So for all of that and more, I'll see on the next lesson.
25597
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.