3. Principles of Database Normalization

All right. Time to talk about something really, really important called normalization. Now, this is a tricky one to wrap your head around at first, so I encourage you to rewatch this video as many times as it takes until it starts to really stick.

So by definition, normalization is the process of organizing the tables and columns in a relational database to reduce redundancy and preserve data integrity. So a lot of fancy words there; it's kind of tough to understand what that really means. But basically it's used to do three different things. Number one, eliminate redundant data, which helps to decrease table sizes and, more importantly, reduce processing time and improve efficiency. Number two, it helps us minimize errors and anomalies when we make data modifications, so if we were to insert or update or delete records in our database. And number three, it helps simplify queries and structure the database in a way that enables meaningful, useful analysis.

So it still feels kind of overcomplicated, if you ask me. So my tip to remember what normalization is all about is to think of it this way: in a properly normalized database, every table should serve a distinct and specific purpose.
So you might have one table that only gives you information about products. You might have another that only gives you information about dates, like a calendar table. You might have one that's only daily transactional records, and another that's only about customers. Now this should sound pretty familiar, because these are the exact types of tables that we're using here in this AdventureWorks demo.

So let me take a stab at visualizing why normalization is such an important concept. Consider a table like this. You've got transaction quantities here in the third column, broken down by product ID and by date, as well as all of this extra information about each product ID: the brand, the name, the SKU, and the weight. And as you can see just from this small sample, we have multiple transactions, or multiple quantity values, per day and multiple quantity values per product ID.

So this table is not normalized. It doesn't serve a single unique purpose. It's actually serving at least two purposes: one, providing the transaction quantity by date and product ID, and two, providing additional attributes about those products. Those are two different purposes. So what you end up with here are all of these duplicate rows in any case where the same product ID appears more than once. So you see duplicate brand names, product names, duplicate SKUs, and product weights.
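To make the duplication concrete, here is a minimal Python sketch of a flat table like the one described. The table shown in the video is not reproduced in this transcript, so every row, product ID, brand, name, SKU, and weight below is a made-up illustration value, and the column order is an assumption.

```python
from collections import Counter

# Hypothetical flat (un-normalized) table: every transaction row repeats
# the product attributes. All values are invented for illustration.
flat_rows = [
    # (date, product_id, quantity, brand, product_name, sku, weight_lb)
    ("2024-01-01", 1, 2, "BrandA", "Helmet", "HL-001", 1.2),
    ("2024-01-01", 1, 1, "BrandA", "Helmet", "HL-001", 1.2),
    ("2024-01-01", 2, 3, "BrandB", "Gloves", "GL-002", 0.4),
    ("2024-01-02", 1, 4, "BrandA", "Helmet", "HL-001", 1.2),
    ("2024-01-02", 2, 1, "BrandB", "Gloves", "GL-002", 0.4),
]

# Count how many times each product's attribute tuple is stored.
# The key is (product_id, brand, product_name, sku, weight_lb).
attribute_copies = Counter((row[1],) + row[3:] for row in flat_rows)

# Every copy beyond the first one per product is redundant storage.
redundant_copies = sum(count - 1 for count in attribute_copies.values())

for attrs, count in attribute_copies.items():
    print(f"product {attrs[0]}: attributes stored {count} times")
print(f"redundant attribute copies: {redundant_copies}")
```

Even in this five-row sample, the product attributes are stored five times although there are only two products, which is the redundancy the table in the video illustrates.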
And you might be wondering: OK, that's not that big a deal. I'm still getting the information that I need. In fact, I have it all in one place in a single table, which is great. So I don't see the downside here.

Well, imagine if we were dealing with 100 different products, and each of those products on average sold 10,000 times a day. Now all of a sudden you're talking about a million duplicate rows for every single date in the data set. So you can see that with larger, more complex models, minor inefficiencies like this can become major, major problems as you scale up in size.

So the way to avoid issues like this is to strip those product attribute columns out of this table and create a relationship with a single product lookup. And if that product lookup contains a unique list of product IDs with those associated attributes, then we can access the exact same information here while eliminating every one of those duplicate rows.

So again, this concept may still feel a little bit ambiguous, but trust me, we're going to get our hands dirty. We're going to do a ton of demos, walk through a bunch of samples, and this is going to start to feel much more natural as we continue through this section of the course.
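The product lookup idea described above can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names `transactions` and `product_lookup`, the column names, and all sample values are hypothetical, and the course itself builds its model in a different tool. The sketch splits the attributes into a lookup keyed by product ID, joins them back to recover the same information, and then redoes the 100 products by 10,000 sales arithmetic from the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    -- One row per product: the attributes are stored exactly once.
    CREATE TABLE product_lookup (
        product_id   INTEGER PRIMARY KEY,
        brand        TEXT,
        product_name TEXT,
        sku          TEXT,
        weight_lb    REAL
    );
    -- One row per transaction: only the key and the measure remain.
    CREATE TABLE transactions (
        date       TEXT,
        product_id INTEGER REFERENCES product_lookup(product_id),
        quantity   INTEGER
    );
""")

# Invented sample data, for illustration only.
conn.executemany(
    "INSERT INTO product_lookup VALUES (?, ?, ?, ?, ?)",
    [(1, "BrandA", "Helmet", "HL-001", 1.2),
     (2, "BrandB", "Gloves", "GL-002", 0.4)],
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [("2024-01-01", 1, 2), ("2024-01-01", 1, 1), ("2024-01-01", 2, 3),
     ("2024-01-02", 1, 4), ("2024-01-02", 2, 1)],
)

# The relationship on product_id recovers the same columns as the flat table.
joined = conn.execute("""
    SELECT t.date, t.product_id, t.quantity,
           p.brand, p.product_name, p.sku, p.weight_lb
    FROM transactions AS t
    JOIN product_lookup AS p ON p.product_id = t.product_id
    ORDER BY t.date, t.product_id, t.quantity
""").fetchall()
for row in joined:
    print(row)

# Scale check using the numbers from the transcript.
products = 100
sales_per_product_per_day = 10_000
rows_per_day = products * sales_per_product_per_day  # flat-table rows per date
redundant_per_day = rows_per_day - products          # attribute copies beyond one per product
print(rows_per_day, redundant_per_day)
```

With the lookup in place, the attribute columns are stored once per product no matter how many transactions arrive, so the daily growth is limited to the narrow transaction rows.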
So next up, we're going to talk about data tables and lookup tables as our first step towards building a properly normalized model.
