3. Principles of Database Normalization

All right. Time to talk about something really, really important called normalization. Now, this is a tricky one to wrap your head around at first, so I encourage you to rewatch this video as many times as it takes until it starts to really stick.

So, by definition, normalization is the process of organizing the tables and columns in a relational database to reduce redundancy and preserve data integrity. So, a lot of fancy words there, and it's kind of tough to understand what that really means. But basically it's used to do three different things. Number one, eliminate redundant data, which helps to decrease table sizes and, more importantly, increase processing speed and improve efficiency. Number two, it helps us minimize errors and anomalies when we make data modifications, so if we were to insert, update, or delete records in our database. And number three, it helps simplify queries and structure the database in a way that enables meaningful, useful analysis.

So it still feels kind of overcomplicated, if you ask me. So my tip to remember what normalization is all about is to think of it this way: in a properly normalized database, every table should serve a distinct and specific purpose.
So you might have one table that only gives you information about products. You might have another that only gives you information about dates, like a calendar table. You might have one that's only daily transactional records and another that's only about customers. Now this should sound pretty familiar, because these are the exact types of tables that we're using here in this AdventureWorks demo.

So let me take a stab at visualizing why normalization is such an important concept. Consider a table like this. You've got transaction quantities here in the third column, broken down by product ID and by date, as well as all of this extra information about each product ID: the brand, the name, the SKU, and the weight. And as you can see just from this small sample, we have multiple transactions, or multiple quantity values, per day and multiple quantity values per product ID.

So this table is not normalized. It doesn't serve a single unique purpose. It's actually serving at least two purposes: one, providing the transaction quantity by date and product ID, and two, providing additional attributes about those products. Those are two different purposes. So what you end up with here are all of these duplicate rows in any case where the same product ID appears more than once. So you see duplicate brand names, product names, duplicate SKUs, and product weights.
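
To make the duplication concrete, here is a minimal Python sketch of a flat table like the one just described. The sample rows, the brand and product names, the SKUs, and the function names are all hypothetical and are not taken from the AdventureWorks data. The sketch counts how many rows repeat product attributes that an earlier row already stored, and then shows the kind of anomaly that can appear when only one copy of an attribute is edited.

```python
# Hypothetical sample rows, not real AdventureWorks data.
# Column order: (date, product_id, quantity, brand, name, sku, weight)
flat_rows = [
    ("2024-01-01", 1, 2, "Acme", "Road Helmet", "HL-R01", 1.2),
    ("2024-01-01", 2, 1, "Acme", "Water Bottle", "WB-001", 0.3),
    ("2024-01-01", 1, 5, "Acme", "Road Helmet", "HL-R01", 1.2),
    ("2024-01-02", 1, 3, "Acme", "Road Helmet", "HL-R01", 1.2),
    ("2024-01-02", 2, 4, "Acme", "Water Bottle", "WB-001", 0.3),
]


def count_redundant_attribute_rows(rows):
    """Count rows whose product ID already appeared in an earlier row."""
    seen = set()
    redundant = 0
    for _date, product_id, _qty, *_attrs in rows:
        if product_id in seen:
            redundant += 1
        else:
            seen.add(product_id)
    return redundant


def attribute_versions(rows):
    """Map each product ID to the set of distinct attribute tuples stored for it."""
    versions = {}
    for _date, product_id, _qty, *attrs in rows:
        versions.setdefault(product_id, set()).add(tuple(attrs))
    return versions


# Five rows but only two products, so three rows repeat attributes.
redundant = count_redundant_attribute_rows(flat_rows)

# Update anomaly: change the weight on a single row and the table
# now holds two conflicting versions of product 1.
edited = list(flat_rows)
edited[0] = edited[0][:6] + (1.1,)
conflicts = {
    pid for pid, found in attribute_versions(edited).items() if len(found) > 1
}

print(redundant, conflicts)
```

In this sketch the redundancy grows with every new transaction, because each sale carries its own copy of the brand, name, SKU, and weight.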
And you might be wondering: OK, that's not that big a deal. I'm still getting the information that I need. In fact, I have it all in one place in a single table, which is great. So I don't see the downside here.

Well, imagine if we were dealing with 100 different products and each of those products, on average, sold 10,000 times a day. Now all of a sudden you're talking about a million duplicate rows for every single date in the data set. So you can see that with larger, more complex models, minor inefficiencies like this can become major, major problems as you scale up in size.

So the way to avoid issues like this is to strip those product attribute columns out of this table and create a relationship with a single product lookup. And if that product lookup contains a unique list of product IDs with those associated attributes, then we can access the exact same information here while eliminating every one of those duplicate rows.

So again, this concept may still feel a little bit ambiguous, but trust me, we're going to get our hands dirty. We're going to do a ton of demos, walk through a bunch of samples, and this is going to start to feel much more natural as we continue through this section of the course.
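
The split into a transaction table and a product lookup can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample rows, the names normalize and join, and the choice of a dictionary keyed by product ID are all hypothetical, and a real model would define the relationship in the database or modeling tool rather than in code like this. The last few lines work through the 100 products at 10,000 sales a day example, assuming the four attribute columns mentioned earlier (brand, name, SKU, weight).

```python
# Hypothetical sample rows, not real AdventureWorks data.
# Column order: (date, product_id, quantity, brand, name, sku, weight)
flat_rows = [
    ("2024-01-01", 1, 2, "Acme", "Road Helmet", "HL-R01", 1.2),
    ("2024-01-01", 2, 1, "Acme", "Water Bottle", "WB-001", 0.3),
    ("2024-01-01", 1, 5, "Acme", "Road Helmet", "HL-R01", 1.2),
    ("2024-01-02", 1, 3, "Acme", "Road Helmet", "HL-R01", 1.2),
]


def normalize(rows):
    """Split flat rows into a transaction table and a product lookup."""
    transactions = []
    product_lookup = {}
    for date, product_id, qty, brand, name, sku, weight in rows:
        # The transaction table keeps only the key and the measure.
        transactions.append((date, product_id, qty))
        # Each product's attributes are stored once, keyed by product ID.
        product_lookup.setdefault(product_id, (brand, name, sku, weight))
    return transactions, product_lookup


def join(transactions, product_lookup):
    """Rebuild the flat view by following the product ID relationship."""
    return [
        (date, pid, qty, *product_lookup[pid]) for date, pid, qty in transactions
    ]


transactions, product_lookup = normalize(flat_rows)

# The same information is still reachable through the relationship.
assert join(transactions, product_lookup) == flat_rows

# Arithmetic from the example: 100 products, each sold 10,000 times a day.
products = 100
sales_per_product_per_day = 10_000
attribute_columns = 4  # brand, name, SKU, weight

rows_per_day = products * sales_per_product_per_day        # 1,000,000
flat_attribute_cells = rows_per_day * attribute_columns    # 4,000,000 per day
lookup_attribute_cells = products * attribute_columns      # 400 in total

print(rows_per_day, flat_attribute_cells, lookup_attribute_cells)
```

Under these assumptions the transaction table still gains a million rows a day, but the attribute values live in one place, so a change to a product's weight or name is made exactly once.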
So next up, we're going to talk about data tables and lookup tables as our first step towards building a properly normalized model.