All language subtitles for 4. Understanding Data Tables vs. Lookup Tables

af Afrikaans
sq Albanian
am Amharic
ar Arabic Download
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bn Bengali
bs Bosnian
bg Bulgarian
ca Catalan
ceb Cebuano
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
tl Filipino
fi Finnish
fr French
fy Frisian
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
km Khmer
ko Korean
ku Kurdish (Kurmanji)
ky Kyrgyz
lo Lao
la Latin
lv Latvian
lt Lithuanian
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mn Mongolian
my Myanmar (Burmese)
ne Nepali
no Norwegian
ps Pashto
fa Persian
pl Polish
pt Portuguese
pa Punjabi
ro Romanian
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
st Sesotho
sn Shona
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
te Telugu
th Thai
tr Turkish
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
or Odia (Oriya)
rw Kinyarwanda
tk Turkmen
tt Tatar
ug Uyghur
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:00,640 --> 00:00:00,980 All right. 2 00:00:00,990 --> 00:00:04,040 Continue on with this concept of normalization. 3 00:00:04,050 --> 00:00:07,560 Let's talk about data tables versus lookup tables. 4 00:00:07,740 --> 00:00:13,920 Now generally speaking data models contain two different types of tables data sometimes known as fact 5 00:00:13,920 --> 00:00:14,660 tables. 6 00:00:14,880 --> 00:00:21,630 And look up or dimension tables and the way to kind of differentiate between the two is that data tables 7 00:00:21,990 --> 00:00:27,410 contain those quantitative metrics the numbers or values that you really care about. 8 00:00:27,720 --> 00:00:34,560 Typically these data tables are pretty granular detailed level with ID or key columns that can be used 9 00:00:34,560 --> 00:00:38,060 to create table relationships to look ups. 10 00:00:38,160 --> 00:00:45,990 Now on the other hand look up tables they provide descriptive often text based attributes or characteristics 11 00:00:46,470 --> 00:00:48,190 about each dimension and table. 12 00:00:48,330 --> 00:00:54,240 So you remember in the last lecture I talked about each table in a model serving a specific and distinct 13 00:00:54,240 --> 00:00:55,430 purpose. 14 00:00:55,440 --> 00:01:02,310 Well that's exactly what lookup tables are doing they're providing rich information and attributes about 15 00:01:02,310 --> 00:01:06,370 things like customers like products like dates. 16 00:01:06,630 --> 00:01:10,280 And that's exactly the role that they play within our model. 17 00:01:10,290 --> 00:01:17,430 So let's walk through an example here this blue table is a data table that contains numerical values 18 00:01:17,440 --> 00:01:24,670 those quantities along with two key columns that we can use to connect this data table with some lookups 19 00:01:24,900 --> 00:01:27,580 a date column and a product ID column. 20 00:01:27,910 --> 00:01:34,030 So what we'll be able to do is take that date column and connect it to a date column within a calendar 21 00:01:34,030 --> 00:01:38,300 lookup table and the entire purpose of the calendar lookup table. 22 00:01:38,430 --> 00:01:44,590 It's entire reason for existing is to provide additional information about each date. 23 00:01:44,800 --> 00:01:50,740 So it's got one row per date and additional attributes like the day of the month the month number the 24 00:01:50,740 --> 00:01:53,920 year the week day and so on and so forth. 25 00:01:53,920 --> 00:01:58,780 So the bottom line is that if you know the date you also know all of this additional information about 26 00:01:58,780 --> 00:02:05,850 months quarters years weeks etc. and the calendar Look-Up provides or gives us access to that information. 27 00:02:06,280 --> 00:02:13,120 Same story holds for the Product ID that product ID column in our data table can connect to or relate 28 00:02:13,120 --> 00:02:20,590 to it matching product ID in a product lookup table and that product lookup tables purpose is to provide 29 00:02:20,650 --> 00:02:25,130 all sorts of product specific information about each ID. 30 00:02:25,150 --> 00:02:31,900 So similar concept to the calendar table just in this case if you know the product ID then by definition 31 00:02:31,900 --> 00:02:37,390 you know that brand you know the product name you know the skew the retail price the cost the weight 32 00:02:37,720 --> 00:02:42,800 and any other product specific attributes captured by this product lookup table. 33 00:02:43,210 --> 00:02:47,930 So in order to understand the real mechanics of how these relationships work. 34 00:02:47,950 --> 00:02:53,910 One of the first things we need to dig into is the idea of primary versus foreign keys. 35 00:02:53,920 --> 00:02:57,720 Now consider the same set of three tables or blue data table. 36 00:02:57,890 --> 00:03:04,360 Our green calendar and our orange product lookup table consider the date and product ID columns in our 37 00:03:04,360 --> 00:03:05,660 data table. 38 00:03:05,710 --> 00:03:11,560 These are called foreign keys they're keys because they're used to create relationships with our lookup 39 00:03:11,560 --> 00:03:17,440 tables and they're foreign because they contain multiple instances of each value. 40 00:03:17,530 --> 00:03:20,140 And this is the case with many data tables. 41 00:03:20,170 --> 00:03:27,040 Here we have multiple transactions and multiple quantity values per day and multiple transactions and 42 00:03:27,040 --> 00:03:29,720 quantity values per product ID. 43 00:03:29,800 --> 00:03:36,130 So we could potentially see many many duplicate values in both of those date and product ID columns. 44 00:03:36,130 --> 00:03:42,190 Now when we look at the same columns in the lookup tables date in the calendar table and product ID 45 00:03:42,310 --> 00:03:49,330 in the product lookup table These are called primary keys because they uniquely identify each row of 46 00:03:49,330 --> 00:03:54,490 the table and they match the foreign keys in any related data tables. 47 00:03:54,490 --> 00:03:56,260 So it's easy to mix these up. 48 00:03:56,410 --> 00:04:03,110 But as a rule of thumb primary keys are unique foreign keys often contain duplicates. 49 00:04:03,280 --> 00:04:05,140 So let's jump in a power be-I. 50 00:04:05,290 --> 00:04:10,210 I want to open up the relationships view and see if we can identify some of these primary and foreign 51 00:04:10,210 --> 00:04:14,280 keys in the Adventure Works tables that we're working with. 52 00:04:14,410 --> 00:04:14,820 All right. 53 00:04:14,860 --> 00:04:20,040 So once you've got your file open we're going to go ahead and navigate to the relationships. 54 00:04:20,650 --> 00:04:25,720 And we're going to be spending a lot of time here because this is where the data modeling magic really 55 00:04:25,720 --> 00:04:26,680 happens. 56 00:04:26,950 --> 00:04:32,500 And the first thing I want to do is just looking at these tables that we've got I identify which are 57 00:04:32,500 --> 00:04:35,260 data tables and which are look at tables. 58 00:04:35,260 --> 00:04:37,960 Now we've already started hinting at this quite a bit. 59 00:04:37,990 --> 00:04:43,870 We know that the one table that contains our quantitative metrics the values that we care about the 60 00:04:43,870 --> 00:04:47,290 order quantities is our A.W. sales table. 61 00:04:47,290 --> 00:04:54,010 So let's grab that and we're going to pull it down to kind of distinguish it from the lookup tables. 62 00:04:54,010 --> 00:04:58,720 So now we know this is our data table looking up at our other tables. 63 00:04:58,720 --> 00:05:02,530 Product Look-Up contains all sorts of information about products. 64 00:05:02,830 --> 00:05:07,300 Customer is all obviously focused on customer level information. 65 00:05:07,550 --> 00:05:13,810 Calendar is all about dates territory has information about sales regions countries and continents. 66 00:05:13,960 --> 00:05:20,080 And then we have some additional product related category and subcategory tables that are almost an 67 00:05:20,140 --> 00:05:22,720 orphan off here so come on guys. 68 00:05:24,020 --> 00:05:29,510 So all of these tables serve very specific distinct purposes. 69 00:05:29,720 --> 00:05:36,080 And I'm kind of an OCD person so I'm going to drag the product lookup table over here with its friends 70 00:05:36,860 --> 00:05:39,510 and click and drag and just rearrange things a bit. 71 00:05:39,690 --> 00:05:44,430 We're going to be doing more designing in here but there we go. 72 00:05:44,450 --> 00:05:50,050 So at this point we know which tables are lookup tables and we know which tables or data table. 73 00:05:50,300 --> 00:05:55,600 Let's take a quick pass through and see if we can identify our primary and foreign keys. 74 00:05:55,640 --> 00:06:02,600 So starting with our look ups customer look up clearly has a customer key that uniquely identify each 75 00:06:02,600 --> 00:06:10,670 row within this table and therefore uniquely identify as all of this information about customers calendar 76 00:06:10,670 --> 00:06:11,380 look up. 77 00:06:11,450 --> 00:06:12,370 Same story here. 78 00:06:12,380 --> 00:06:15,890 It's not called a key but the date serves that purpose. 79 00:06:15,890 --> 00:06:21,080 If you know the date you know all of this information about that date the month that week the day of 80 00:06:21,080 --> 00:06:27,710 the week and so on and so forth territories we've got a clear sales territory key here which maps to 81 00:06:27,710 --> 00:06:32,310 regions countries and continents and then products are a little bit more complicated. 82 00:06:32,330 --> 00:06:36,040 We're going to get into this later when we talk about snowflake's agreements. 83 00:06:36,320 --> 00:06:41,690 But what you may notice is that some of these tables actually contain more than one key but for the 84 00:06:41,690 --> 00:06:44,480 time being pretend like you never even saw that. 85 00:06:45,050 --> 00:06:48,840 Now shifting down to the data table let's talk about foreign keys. 86 00:06:49,100 --> 00:06:52,340 So we know that multiple sales take place per day. 87 00:06:52,430 --> 00:06:58,490 So both of these date fields order date and stock date are both foreign keys because there are multiple 88 00:06:58,490 --> 00:07:00,160 instances of each value. 89 00:07:00,410 --> 00:07:06,410 And those dates can tie into the date field or calendar look at similar starting with product key got 90 00:07:06,410 --> 00:07:08,230 multiple sales per product. 91 00:07:08,300 --> 00:07:09,760 Same with customer key. 92 00:07:09,770 --> 00:07:11,210 Same with territory. 93 00:07:11,210 --> 00:07:17,220 So all three of these keys plus the two dates make up our foreign keys within the data table. 94 00:07:17,270 --> 00:07:19,180 So that's about enough for now. 95 00:07:19,250 --> 00:07:24,520 That's your crash course in data versus look up tables and primary versus foreign keys. 10415

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.