Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:00,640 --> 00:00:00,980
All right.
2
00:00:00,990 --> 00:00:04,040
Continue on with this concept of normalization.
3
00:00:04,050 --> 00:00:07,560
Let's talk about data tables versus lookup tables.
4
00:00:07,740 --> 00:00:13,920
Now generally speaking data models contain two different types of tables data sometimes known as fact
5
00:00:13,920 --> 00:00:14,660
tables.
6
00:00:14,880 --> 00:00:21,630
And look up or dimension tables and the way to kind of differentiate between the two is that data tables
7
00:00:21,990 --> 00:00:27,410
contain those quantitative metrics the numbers or values that you really care about.
8
00:00:27,720 --> 00:00:34,560
Typically these data tables are pretty granular detailed level with ID or key columns that can be used
9
00:00:34,560 --> 00:00:38,060
to create table relationships to look ups.
10
00:00:38,160 --> 00:00:45,990
Now on the other hand look up tables they provide descriptive often text based attributes or characteristics
11
00:00:46,470 --> 00:00:48,190
about each dimension and table.
12
00:00:48,330 --> 00:00:54,240
So you remember in the last lecture I talked about each table in a model serving a specific and distinct
13
00:00:54,240 --> 00:00:55,430
purpose.
14
00:00:55,440 --> 00:01:02,310
Well that's exactly what lookup tables are doing they're providing rich information and attributes about
15
00:01:02,310 --> 00:01:06,370
things like customers like products like dates.
16
00:01:06,630 --> 00:01:10,280
And that's exactly the role that they play within our model.
17
00:01:10,290 --> 00:01:17,430
So let's walk through an example here this blue table is a data table that contains numerical values
18
00:01:17,440 --> 00:01:24,670
those quantities along with two key columns that we can use to connect this data table with some lookups
19
00:01:24,900 --> 00:01:27,580
a date column and a product ID column.
20
00:01:27,910 --> 00:01:34,030
So what we'll be able to do is take that date column and connect it to a date column within a calendar
21
00:01:34,030 --> 00:01:38,300
lookup table and the entire purpose of the calendar lookup table.
22
00:01:38,430 --> 00:01:44,590
It's entire reason for existing is to provide additional information about each date.
23
00:01:44,800 --> 00:01:50,740
So it's got one row per date and additional attributes like the day of the month the month number the
24
00:01:50,740 --> 00:01:53,920
year the week day and so on and so forth.
25
00:01:53,920 --> 00:01:58,780
So the bottom line is that if you know the date you also know all of this additional information about
26
00:01:58,780 --> 00:02:05,850
months quarters years weeks etc. and the calendar Look-Up provides or gives us access to that information.
27
00:02:06,280 --> 00:02:13,120
Same story holds for the Product ID that product ID column in our data table can connect to or relate
28
00:02:13,120 --> 00:02:20,590
to it matching product ID in a product lookup table and that product lookup tables purpose is to provide
29
00:02:20,650 --> 00:02:25,130
all sorts of product specific information about each ID.
30
00:02:25,150 --> 00:02:31,900
So similar concept to the calendar table just in this case if you know the product ID then by definition
31
00:02:31,900 --> 00:02:37,390
you know that brand you know the product name you know the skew the retail price the cost the weight
32
00:02:37,720 --> 00:02:42,800
and any other product specific attributes captured by this product lookup table.
33
00:02:43,210 --> 00:02:47,930
So in order to understand the real mechanics of how these relationships work.
34
00:02:47,950 --> 00:02:53,910
One of the first things we need to dig into is the idea of primary versus foreign keys.
35
00:02:53,920 --> 00:02:57,720
Now consider the same set of three tables or blue data table.
36
00:02:57,890 --> 00:03:04,360
Our green calendar and our orange product lookup table consider the date and product ID columns in our
37
00:03:04,360 --> 00:03:05,660
data table.
38
00:03:05,710 --> 00:03:11,560
These are called foreign keys they're keys because they're used to create relationships with our lookup
39
00:03:11,560 --> 00:03:17,440
tables and they're foreign because they contain multiple instances of each value.
40
00:03:17,530 --> 00:03:20,140
And this is the case with many data tables.
41
00:03:20,170 --> 00:03:27,040
Here we have multiple transactions and multiple quantity values per day and multiple transactions and
42
00:03:27,040 --> 00:03:29,720
quantity values per product ID.
43
00:03:29,800 --> 00:03:36,130
So we could potentially see many many duplicate values in both of those date and product ID columns.
44
00:03:36,130 --> 00:03:42,190
Now when we look at the same columns in the lookup tables date in the calendar table and product ID
45
00:03:42,310 --> 00:03:49,330
in the product lookup table These are called primary keys because they uniquely identify each row of
46
00:03:49,330 --> 00:03:54,490
the table and they match the foreign keys in any related data tables.
47
00:03:54,490 --> 00:03:56,260
So it's easy to mix these up.
48
00:03:56,410 --> 00:04:03,110
But as a rule of thumb primary keys are unique foreign keys often contain duplicates.
49
00:04:03,280 --> 00:04:05,140
So let's jump in a power be-I.
50
00:04:05,290 --> 00:04:10,210
I want to open up the relationships view and see if we can identify some of these primary and foreign
51
00:04:10,210 --> 00:04:14,280
keys in the Adventure Works tables that we're working with.
52
00:04:14,410 --> 00:04:14,820
All right.
53
00:04:14,860 --> 00:04:20,040
So once you've got your file open we're going to go ahead and navigate to the relationships.
54
00:04:20,650 --> 00:04:25,720
And we're going to be spending a lot of time here because this is where the data modeling magic really
55
00:04:25,720 --> 00:04:26,680
happens.
56
00:04:26,950 --> 00:04:32,500
And the first thing I want to do is just looking at these tables that we've got I identify which are
57
00:04:32,500 --> 00:04:35,260
data tables and which are look at tables.
58
00:04:35,260 --> 00:04:37,960
Now we've already started hinting at this quite a bit.
59
00:04:37,990 --> 00:04:43,870
We know that the one table that contains our quantitative metrics the values that we care about the
60
00:04:43,870 --> 00:04:47,290
order quantities is our A.W. sales table.
61
00:04:47,290 --> 00:04:54,010
So let's grab that and we're going to pull it down to kind of distinguish it from the lookup tables.
62
00:04:54,010 --> 00:04:58,720
So now we know this is our data table looking up at our other tables.
63
00:04:58,720 --> 00:05:02,530
Product Look-Up contains all sorts of information about products.
64
00:05:02,830 --> 00:05:07,300
Customer is all obviously focused on customer level information.
65
00:05:07,550 --> 00:05:13,810
Calendar is all about dates territory has information about sales regions countries and continents.
66
00:05:13,960 --> 00:05:20,080
And then we have some additional product related category and subcategory tables that are almost an
67
00:05:20,140 --> 00:05:22,720
orphan off here so come on guys.
68
00:05:24,020 --> 00:05:29,510
So all of these tables serve very specific distinct purposes.
69
00:05:29,720 --> 00:05:36,080
And I'm kind of an OCD person so I'm going to drag the product lookup table over here with its friends
70
00:05:36,860 --> 00:05:39,510
and click and drag and just rearrange things a bit.
71
00:05:39,690 --> 00:05:44,430
We're going to be doing more designing in here but there we go.
72
00:05:44,450 --> 00:05:50,050
So at this point we know which tables are lookup tables and we know which tables or data table.
73
00:05:50,300 --> 00:05:55,600
Let's take a quick pass through and see if we can identify our primary and foreign keys.
74
00:05:55,640 --> 00:06:02,600
So starting with our look ups customer look up clearly has a customer key that uniquely identify each
75
00:06:02,600 --> 00:06:10,670
row within this table and therefore uniquely identify as all of this information about customers calendar
76
00:06:10,670 --> 00:06:11,380
look up.
77
00:06:11,450 --> 00:06:12,370
Same story here.
78
00:06:12,380 --> 00:06:15,890
It's not called a key but the date serves that purpose.
79
00:06:15,890 --> 00:06:21,080
If you know the date you know all of this information about that date the month that week the day of
80
00:06:21,080 --> 00:06:27,710
the week and so on and so forth territories we've got a clear sales territory key here which maps to
81
00:06:27,710 --> 00:06:32,310
regions countries and continents and then products are a little bit more complicated.
82
00:06:32,330 --> 00:06:36,040
We're going to get into this later when we talk about snowflake's agreements.
83
00:06:36,320 --> 00:06:41,690
But what you may notice is that some of these tables actually contain more than one key but for the
84
00:06:41,690 --> 00:06:44,480
time being pretend like you never even saw that.
85
00:06:45,050 --> 00:06:48,840
Now shifting down to the data table let's talk about foreign keys.
86
00:06:49,100 --> 00:06:52,340
So we know that multiple sales take place per day.
87
00:06:52,430 --> 00:06:58,490
So both of these date fields order date and stock date are both foreign keys because there are multiple
88
00:06:58,490 --> 00:07:00,160
instances of each value.
89
00:07:00,410 --> 00:07:06,410
And those dates can tie into the date field or calendar look at similar starting with product key got
90
00:07:06,410 --> 00:07:08,230
multiple sales per product.
91
00:07:08,300 --> 00:07:09,760
Same with customer key.
92
00:07:09,770 --> 00:07:11,210
Same with territory.
93
00:07:11,210 --> 00:07:17,220
So all three of these keys plus the two dates make up our foreign keys within the data table.
94
00:07:17,270 --> 00:07:19,180
So that's about enough for now.
95
00:07:19,250 --> 00:07:24,520
That's your crash course in data versus look up tables and primary versus foreign keys.
10415
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.