All language subtitles for 003 Database Normalization_en

af Afrikaans
ak Akan
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bem Bemba
bn Bengali
bh Bihari
bs Bosnian
br Breton
bg Bulgarian
km Cambodian
ca Catalan
ceb Cebuano
chr Cherokee
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
ee Ewe
fo Faroese
tl Filipino
fi Finnish
fr French Download
fy Frisian
gaa Ga
gl Galician
ka Georgian
de German
el Greek
gn Guarani
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ia Interlingua
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
rw Kinyarwanda
rn Kirundi
kg Kongo
ko Korean
kri Krio (Sierra Leone)
ku Kurdish
ckb Kurdish (Soranî)
ky Kyrgyz
lo Laothian
la Latin
lv Latvian
ln Lingala
lt Lithuanian
loz Lozi
lg Luganda
ach Luo
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mfe Mauritian Creole
mo Moldavian
mn Mongolian
my Myanmar (Burmese)
sr-ME Montenegrin
ne Nepali
pcm Nigerian Pidgin
nso Northern Sotho
no Norwegian
nn Norwegian (Nynorsk)
oc Occitan
or Oriya
om Oromo
ps Pashto
fa Persian
pl Polish
pt-BR Portuguese (Brazil)
pt Portuguese (Portugal)
pa Punjabi
qu Quechua
ro Romanian
rm Romansh
nyn Runyakitara
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
sh Serbo-Croatian
st Sesotho
tn Setswana
crs Seychellois Creole
sn Shona
sd Sindhi
si Sinhalese
sk Slovak
sl Slovenian
so Somali
es Spanish
es-419 Spanish (Latin American)
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
tt Tatar
te Telugu
th Thai
ti Tigrinya
to Tonga
lua Tshiluba
tum Tumbuka
tr Turkish
tk Turkmen
tw Twi
ug Uighur
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
wo Wolof
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:00,000 --> 00:00:01,000 Instructor: All right, let's talk about a concept 2 00:00:01,000 --> 00:00:03,000 called normalization. 3 00:00:03,000 --> 00:00:06,000 Now, this is a really important topic. 4 00:00:06,000 --> 00:00:08,000 It's gonna help you grasp exactly why 5 00:00:08,000 --> 00:00:10,000 and how data models actually work, 6 00:00:10,000 --> 00:00:11,000 but it can be a little bit tricky 7 00:00:11,000 --> 00:00:13,000 to wrap your head around at first. 8 00:00:15,000 --> 00:00:16,000 So the textbook definition here 9 00:00:16,000 --> 00:00:18,000 is that normalization is the process 10 00:00:18,000 --> 00:00:20,000 of organizing the tables and columns 11 00:00:20,000 --> 00:00:22,000 in a relational database 12 00:00:22,000 --> 00:00:25,000 to reduce redundancy and preserve data integrity. 13 00:00:25,000 --> 00:00:28,000 So what the heck does that even mean, right? 14 00:00:29,000 --> 00:00:31,000 Well, it's commonly used to do three different things. 15 00:00:31,000 --> 00:00:35,000 First, to eliminate redundant data in a model, 16 00:00:35,000 --> 00:00:37,000 to decrease your table sizes, and more importantly, 17 00:00:37,000 --> 00:00:40,000 improve processing speed and efficiency. 18 00:00:41,000 --> 00:00:43,000 It's also used to minimize errors and anomalies 19 00:00:43,000 --> 00:00:46,000 when you do things like insert, update, or delete, 20 00:00:46,000 --> 00:00:49,000 or otherwise modify data inside of your model. 21 00:00:50,000 --> 00:00:52,000 And third, it helps simplify queries 22 00:00:52,000 --> 00:00:55,000 and structure your database for meaningful analysis. 23 00:00:55,000 --> 00:00:57,000 And we'll see examples of this 24 00:00:57,000 --> 00:00:59,000 as we start writing DAX queries and measures 25 00:00:59,000 --> 00:01:01,000 later on in the course. 26 00:01:01,000 --> 00:01:03,000 So at this point, it still feels a little bit vague, right? 27 00:01:03,000 --> 00:01:05,000 A little ambiguous. 28 00:01:05,000 --> 00:01:07,000 So the way that I like to wrap my head around 29 00:01:07,000 --> 00:01:09,000 what normalization really means 30 00:01:09,000 --> 00:01:11,000 is to think of it this way, 31 00:01:11,000 --> 00:01:13,000 in a normalized database, 32 00:01:13,000 --> 00:01:16,000 each table should serve one distinct 33 00:01:16,000 --> 00:01:19,000 and specific purpose and one purpose only. 34 00:01:19,000 --> 00:01:22,000 So you may have a table that only contains information 35 00:01:22,000 --> 00:01:24,000 about individual products. 36 00:01:24,000 --> 00:01:25,000 You might have another table 37 00:01:25,000 --> 00:01:28,000 that only contains daily transaction records, 38 00:01:28,000 --> 00:01:30,000 another that's a calendar table 39 00:01:30,000 --> 00:01:32,000 with information about dates, 40 00:01:32,000 --> 00:01:34,000 and maybe some others that contain attributes 41 00:01:34,000 --> 00:01:37,000 about your customers, or your stores, and so on. 42 00:01:37,000 --> 00:01:38,000 And these should sound pretty familiar 43 00:01:38,000 --> 00:01:40,000 because these are the exact tables 44 00:01:40,000 --> 00:01:43,000 that we'll be working with in the project 45 00:01:43,000 --> 00:01:45,000 that we work through throughout this course. 46 00:01:45,000 --> 00:01:47,000 So let me take a stab at kind of bringing this to life, 47 00:01:47,000 --> 00:01:50,000 and visualizing the concept of normalization a bit. 48 00:01:51,000 --> 00:01:53,000 Imagine a table that looks like this. 49 00:01:53,000 --> 00:01:55,000 You've got transactional records here, right? 50 00:01:55,000 --> 00:01:57,000 You've got dates of transactions, 51 00:01:57,000 --> 00:02:00,000 product IDs that were sold, quantity of goods sold, 52 00:02:00,000 --> 00:02:03,000 but you also have product specific attributes 53 00:02:03,000 --> 00:02:04,000 stored here as well, 54 00:02:04,000 --> 00:02:06,000 like the product brand, the product name, 55 00:02:06,000 --> 00:02:08,000 the SKU, and weight, 56 00:02:08,000 --> 00:02:12,000 and because you have multiple transactions per day, 57 00:02:12,000 --> 00:02:13,000 and multiple transactions per product, 58 00:02:13,000 --> 00:02:17,000 you end up with a bunch of redundant duplicate fields 59 00:02:17,000 --> 00:02:18,000 or rows here. 60 00:02:19,000 --> 00:02:21,000 In other words, every time a specific product ID is sold, 61 00:02:21,000 --> 00:02:24,000 we're storing the same set of information 62 00:02:24,000 --> 00:02:26,000 about the brand, the name, the SKU, and the weight 63 00:02:26,000 --> 00:02:28,000 in row after row after row. 64 00:02:28,000 --> 00:02:30,000 And you might be looking at this and saying, 65 00:02:30,000 --> 00:02:32,000 "You know, what's the big deal? 66 00:02:32,000 --> 00:02:33,000 Like, I've got all the information that I need 67 00:02:33,000 --> 00:02:34,000 for my analysis. 68 00:02:35,000 --> 00:02:36,000 I've got it all in one place. 69 00:02:36,000 --> 00:02:37,000 So that's great, right? 70 00:02:38,000 --> 00:02:40,000 But suppose that instead of a handful of records, 71 00:02:40,000 --> 00:02:43,000 like we see here, you've got a hundred different products 72 00:02:43,000 --> 00:02:46,000 that sell an average of 10,000 times a day. 73 00:02:46,000 --> 00:02:50,000 Now, we're talking about upwards of 1 million duplicate rows 74 00:02:50,000 --> 00:02:52,000 for each date in the table. 75 00:02:53,000 --> 00:02:54,000 So the point is that minor inefficiencies 76 00:02:54,000 --> 00:02:58,000 like this can very quickly become major, major problems 77 00:02:58,000 --> 00:03:00,000 as you scale. 78 00:03:00,000 --> 00:03:02,000 And the way that we would solve a problem like this 79 00:03:02,000 --> 00:03:04,000 is to normalize the data, right? 80 00:03:04,000 --> 00:03:07,000 We'd take all those product specific attributes, 81 00:03:07,000 --> 00:03:09,000 brand names, SKU, and weight, 82 00:03:09,000 --> 00:03:10,000 and we would store those 83 00:03:10,000 --> 00:03:14,000 in its own separate table containing one row, 84 00:03:14,000 --> 00:03:17,000 or one unique record for each product ID 85 00:03:17,000 --> 00:03:20,000 that would essentially transform this one table 86 00:03:20,000 --> 00:03:22,000 into two tables that each serve one purpose. 87 00:03:22,000 --> 00:03:25,000 You'd have transactions with date, product ID, 88 00:03:25,000 --> 00:03:27,000 and quantity in one table. 89 00:03:28,000 --> 00:03:29,000 You'd have products in a second table, 90 00:03:29,000 --> 00:03:32,000 and you could use a relationship between those 91 00:03:32,000 --> 00:03:34,000 based on the product ID field 92 00:03:34,000 --> 00:03:37,000 to access the exact same information 93 00:03:37,000 --> 00:03:39,000 without any of the redundant data. 94 00:03:39,000 --> 00:03:40,000 So hopefully that helps, 95 00:03:40,000 --> 00:03:42,000 we're gonna walk through a ton of additional demos 96 00:03:42,000 --> 00:03:45,000 and examples to help really hammer this point home, 97 00:03:45,000 --> 00:03:47,000 but that's our introduction to the concept 98 00:03:47,000 --> 00:03:49,000 of database normalization. 7615

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.