Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:00,000 --> 00:00:01,000
Narrator: Jumping right in,
2
00:00:01,000 --> 00:00:04,000
if you go to the Query Editor and look at the Home tab,
3
00:00:04,000 --> 00:00:08,000
you'll see all sorts of basic table transformation options.
4
00:00:08,000 --> 00:00:10,000
Let's talk through some of the most common
5
00:00:10,000 --> 00:00:11,000
and frequently used ones.
6
00:00:13,000 --> 00:00:14,000
First off, you can choose to keep
7
00:00:14,000 --> 00:00:17,000
or remove certain columns or rows from a table,
8
00:00:17,000 --> 00:00:19,000
and there are a few interesting options here.
9
00:00:19,000 --> 00:00:21,000
Looking at the column options,
10
00:00:21,000 --> 00:00:24,000
you can either select the columns that you'd like to remove
11
00:00:24,000 --> 00:00:27,000
or select the columns that you'd like to keep.
12
00:00:27,000 --> 00:00:28,000
Now, that might not seem
13
00:00:28,000 --> 00:00:30,000
like a really important distinction,
14
00:00:30,000 --> 00:00:33,000
but it can be very helpful in certain cases.
15
00:00:33,000 --> 00:00:36,000
So let's say that you've got a data connection to a table
16
00:00:36,000 --> 00:00:38,000
that might have a variable number of columns.
17
00:00:38,000 --> 00:00:41,000
Let's say one month the table has 10,
18
00:00:41,000 --> 00:00:43,000
and then the next month it has 20.
19
00:00:43,000 --> 00:00:44,000
But at the end of the day,
20
00:00:44,000 --> 00:00:48,000
you always want to keep the same set of five columns.
21
00:00:48,000 --> 00:00:50,000
If you were to open up the Query Editor
22
00:00:50,000 --> 00:00:53,000
and manually remove each of the columns that you don't want,
23
00:00:53,000 --> 00:00:57,000
then any new columns that appear won't be accounted for
24
00:00:57,000 --> 00:00:59,000
and will ultimately end up in your final table,
25
00:00:59,000 --> 00:01:01,000
which is what you don't want.
26
00:01:01,000 --> 00:01:02,000
On the other hand,
27
00:01:02,000 --> 00:01:05,000
if you select the five columns that you always want and say,
28
00:01:05,000 --> 00:01:08,000
"Hey, Power BI, these are the five columns that I need.
29
00:01:08,000 --> 00:01:12,000
Anything else that you run into or find in that table,
30
00:01:12,000 --> 00:01:13,000
just remove it."
31
00:01:13,000 --> 00:01:15,000
All right, so taking this approach,
32
00:01:15,000 --> 00:01:17,000
you'll be good to go moving forward,
33
00:01:17,000 --> 00:01:19,000
regardless of the other columns
34
00:01:19,000 --> 00:01:21,000
that are present within the table.
35
00:01:21,000 --> 00:01:22,000
I know it's a bit of a nuance,
36
00:01:22,000 --> 00:01:24,000
but it's actually a really helpful tool
37
00:01:24,000 --> 00:01:27,000
when you run into these types of use cases.
38
00:01:27,000 --> 00:01:29,000
Now, as far as rows are concerned
39
00:01:29,000 --> 00:01:31,000
you've got some basic options here.
40
00:01:31,000 --> 00:01:33,000
You can remove the top rows or bottom rows.
41
00:01:33,000 --> 00:01:37,000
You can set a specific number of rows to keep or remove,
42
00:01:37,000 --> 00:01:39,000
and this is really helpful if you have files
43
00:01:39,000 --> 00:01:41,000
that might have header rows or footers
44
00:01:41,000 --> 00:01:43,000
or totals that you need to strip out.
45
00:01:43,000 --> 00:01:47,000
Other options here, you can remove alternate rows,
46
00:01:47,000 --> 00:01:48,000
or you can remove duplicates.
47
00:01:48,000 --> 00:01:49,000
And that could be a great way
48
00:01:49,000 --> 00:01:52,000
to actually create a new lookup table from scratch
49
00:01:52,000 --> 00:01:55,000
by creating a unique list of IDs.
50
00:01:56,000 --> 00:01:58,000
You'll also find some basic sorting options,
51
00:01:58,000 --> 00:02:00,000
like A to Z or low to high,
52
00:02:01,000 --> 00:02:04,000
as well as options to change your data types
53
00:02:04,000 --> 00:02:06,000
or promote your header rows.
54
00:02:06,000 --> 00:02:08,000
Now, Power BI is actually pretty smart about headers,
55
00:02:08,000 --> 00:02:10,000
and by default it should identify
56
00:02:10,000 --> 00:02:12,000
and promote those header rows for you.
57
00:02:12,000 --> 00:02:14,000
But in case it doesn't,
58
00:02:14,000 --> 00:02:17,000
you can always do it manually using these tools here.
59
00:02:18,000 --> 00:02:20,000
And then the last thing I want to call out,
60
00:02:20,000 --> 00:02:23,000
if you right-click any column header,
61
00:02:23,000 --> 00:02:26,000
you can access a lot of these same common tools.
62
00:02:26,000 --> 00:02:29,000
Plus, you can do things like duplicate, move,
63
00:02:29,000 --> 00:02:31,000
or rename columns.
64
00:02:31,000 --> 00:02:32,000
And this is a common theme
65
00:02:32,000 --> 00:02:34,000
that we'll see time and time again
66
00:02:34,000 --> 00:02:35,000
as we work with the Query Editor,
67
00:02:35,000 --> 00:02:38,000
which is that the same tools are often available
68
00:02:38,000 --> 00:02:41,000
in multiple places, in multiple menus, multiple tabs,
69
00:02:41,000 --> 00:02:44,000
sometimes even in multiple views within Power BI.
70
00:02:44,000 --> 00:02:46,000
So with that, let's head back over
71
00:02:46,000 --> 00:02:48,000
to our AdventureWorks Report,
72
00:02:48,000 --> 00:02:50,000
and we're gonna connect to our product lookup data.
73
00:02:50,000 --> 00:02:52,000
So like I've mentioned,
74
00:02:52,000 --> 00:02:55,000
there are multiple ways to connect to data in Power BI.
75
00:02:55,000 --> 00:02:56,000
And from within the Query Editor,
76
00:02:56,000 --> 00:03:00,000
New Source is going to open the Get Data menu.
77
00:03:00,000 --> 00:03:01,000
And we can click the bottom of it
78
00:03:01,000 --> 00:03:04,000
to see some of the most common options here.
79
00:03:04,000 --> 00:03:07,000
We have our Text and CSV option.
80
00:03:07,000 --> 00:03:10,000
And if we click More, Power BI is gonna launch
81
00:03:10,000 --> 00:03:13,000
that same Get Data interface, right?
82
00:03:13,000 --> 00:03:17,000
And all we need to do is click Text/CSV and then Connect,
83
00:03:17,000 --> 00:03:20,000
right, so multiple ways to accomplish the same thing.
84
00:03:20,000 --> 00:03:24,000
So from here, we're gonna locate our product lookup,
85
00:03:24,000 --> 00:03:26,000
right, so AdventureWorks Product Lookup.
86
00:03:28,000 --> 00:03:30,000
And we'll open this.
87
00:03:30,000 --> 00:03:33,000
Like we talked about, this process becomes very easy.
88
00:03:33,000 --> 00:03:34,000
It's a very repeatable thing,
89
00:03:34,000 --> 00:03:37,000
and you'll get very used to doing this over and over again.
90
00:03:37,000 --> 00:03:40,000
So we're brought back in to our data preview window,
91
00:03:40,000 --> 00:03:44,000
file origin, our comma delimiter,
92
00:03:44,000 --> 00:03:47,000
the data type detection based on our first 200 rows.
93
00:03:47,000 --> 00:03:49,000
Again, we're gonna dig into all of this
94
00:03:49,000 --> 00:03:52,000
within the Query Editor, but, you know, overall this looks
95
00:03:52,000 --> 00:03:56,000
like what I would expect to see from this kinda data set.
96
00:03:56,000 --> 00:03:57,000
So we'll click OK,
97
00:03:57,000 --> 00:03:59,000
and this is gonna add this into the Query Editor.
98
00:04:02,000 --> 00:04:03,000
So now over here on the left-hand side,
99
00:04:03,000 --> 00:04:05,000
we've got both of our queries, right?
100
00:04:05,000 --> 00:04:08,000
We've got our Territory Lookup table,
101
00:04:08,000 --> 00:04:09,000
and we see our preview here.
102
00:04:09,000 --> 00:04:11,000
We have our couple of applied steps
103
00:04:11,000 --> 00:04:13,000
that automatically happen.
104
00:04:13,000 --> 00:04:16,000
And we've got our AdventureWorks Product Lookup table,
105
00:04:16,000 --> 00:04:17,000
same thing.
106
00:04:17,000 --> 00:04:22,000
Right, so we've got our query over here, our data preview.
107
00:04:22,000 --> 00:04:25,000
We've got our applied steps, our file name.
108
00:04:25,000 --> 00:04:29,000
And one of the things that I want to quickly look into is
109
00:04:29,000 --> 00:04:31,000
what exactly is happening with these applied steps?
110
00:04:31,000 --> 00:04:33,000
Right, we've got a Source, Promoted Header,
111
00:04:33,000 --> 00:04:35,000
and Changed Type applied step.
112
00:04:35,000 --> 00:04:37,000
So because these are all individual,
113
00:04:37,000 --> 00:04:39,000
we can click into each one of these
114
00:04:39,000 --> 00:04:41,000
and see exactly what's going on.
115
00:04:41,000 --> 00:04:44,000
Right, and if I expand my formula bar,
116
00:04:44,000 --> 00:04:46,000
and we've got this M code here that's saying,
117
00:04:46,000 --> 00:04:48,000
"Hey, we've got this CSV document,
118
00:04:48,000 --> 00:04:50,000
here's its location," right?
119
00:04:50,000 --> 00:04:52,000
And this is gonna be unique to me, right?
120
00:04:52,000 --> 00:04:55,000
So this is my file drive location,
121
00:04:55,000 --> 00:04:56,000
and here's the delimiter,
122
00:04:56,000 --> 00:04:59,000
the number of columns, the file encoding.
123
00:04:59,000 --> 00:05:01,000
Right, so this is all of the data
124
00:05:01,000 --> 00:05:04,000
that Power BI needs about the source.
125
00:05:04,000 --> 00:05:06,000
The next thing that it looks at, as you can see here,
126
00:05:06,000 --> 00:05:09,000
we've got column one, column two, column three,
127
00:05:09,000 --> 00:05:12,000
is it says, "Hey, I recognize that the next row here
128
00:05:12,000 --> 00:05:14,000
is actually the header row,
129
00:05:14,000 --> 00:05:17,000
so I'm going to promote those to the header row."
130
00:05:17,000 --> 00:05:19,000
Right, so now we have ProductKey, ProductSubcategoryKey,
131
00:05:20,000 --> 00:05:23,000
ProductSKU, ProductName, and so on.
132
00:05:23,000 --> 00:05:25,000
And if you look at all of these data types
133
00:05:25,000 --> 00:05:27,000
on the left-hand side here, right,
134
00:05:27,000 --> 00:05:30,000
they're all text base, right?
135
00:05:30,000 --> 00:05:32,000
You can scroll over,
136
00:05:32,000 --> 00:05:34,000
and you'll see that these are all text base.
137
00:05:34,000 --> 00:05:36,000
So the next thing the Power BI does is it says,
138
00:05:36,000 --> 00:05:41,000
"All right, based on the first 200 rows of this file,
139
00:05:41,000 --> 00:05:44,000
I think that these are the most appropriate data types
140
00:05:44,000 --> 00:05:46,000
to assign to each one of these columns."
141
00:05:46,000 --> 00:05:49,000
So we've got our ProductKey here set up as an integer,
142
00:05:49,000 --> 00:05:51,000
our ProductSubcategoryKey,
143
00:05:51,000 --> 00:05:55,000
same thing, as a whole number or an integer value, right?
144
00:05:55,000 --> 00:05:59,000
We've got ProductSKU as a text-based column.
145
00:05:59,000 --> 00:06:00,000
We scroll over to the right here,
146
00:06:00,000 --> 00:06:04,000
again, a bunch of other text-based columns.
147
00:06:04,000 --> 00:06:06,000
And then we get to the end of the data set here,
148
00:06:06,000 --> 00:06:07,000
and we see two columns,
149
00:06:07,000 --> 00:06:10,000
ProductCost and ProductPrice, right?
150
00:06:10,000 --> 00:06:14,000
And these are being assigned as decimal numbers.
151
00:06:14,000 --> 00:06:16,000
Again, so Power BI is going through
152
00:06:16,000 --> 00:06:19,000
and scanning the first 200 rows of this table,
153
00:06:19,000 --> 00:06:21,000
and it's saying, "All right, here's my best guess
154
00:06:21,000 --> 00:06:22,000
at what these different data types are."
155
00:06:22,000 --> 00:06:25,000
So that's what those applied steps are all about.
156
00:06:25,000 --> 00:06:28,000
Now, every time I connect to a brand new data set,
157
00:06:28,000 --> 00:06:30,000
I always like to start with two steps.
158
00:06:30,000 --> 00:06:33,000
The first step is to rename the file.
159
00:06:33,000 --> 00:06:34,000
Right, so I'm gonna delete
160
00:06:34,000 --> 00:06:36,000
the AdventureWorks portion of this,
161
00:06:36,000 --> 00:06:38,000
and we're just gonna call it Product Lookup.
162
00:06:38,000 --> 00:06:42,000
The lookup portion of this is going to be important
163
00:06:42,000 --> 00:06:44,000
once we get to the data modeling section
164
00:06:44,000 --> 00:06:47,000
Chris is gonna talk through why we name our tables that way
165
00:06:47,000 --> 00:06:48,000
and how it's important.
166
00:06:48,000 --> 00:06:51,000
The next thing that I like to do is really go through
167
00:06:51,000 --> 00:06:53,000
and make sure that all of the data types here
168
00:06:53,000 --> 00:06:55,000
are assigned appropriately, right?
169
00:06:55,000 --> 00:06:57,000
And we've kind of already done that, right?
170
00:06:57,000 --> 00:07:00,000
We confirmed that the product keys, subcategory keys,
171
00:07:00,000 --> 00:07:02,000
these make sense as whole numbers.
172
00:07:02,000 --> 00:07:04,000
Everything that looks like it should be
173
00:07:04,000 --> 00:07:07,000
a text-based value is.
174
00:07:07,000 --> 00:07:09,000
So we'll kinda scroll through these again.
175
00:07:09,000 --> 00:07:11,000
And then the one thing here that we can do
176
00:07:11,000 --> 00:07:15,000
is our ProductCost and ProductPrice columns,
177
00:07:15,000 --> 00:07:17,000
these defaulted to a decimal number,
178
00:07:17,000 --> 00:07:19,000
again, which isn't inaccurate,
179
00:07:19,000 --> 00:07:22,000
but we can actually be a little bit more accurate, right?
180
00:07:22,000 --> 00:07:25,000
So these are actually currencies, right?
181
00:07:25,000 --> 00:07:28,000
And that data type within Power Query is referred to
182
00:07:28,000 --> 00:07:30,000
as a fixed decimal number.
183
00:07:30,000 --> 00:07:32,000
Right, so we could click Fixed decimal number,
184
00:07:32,000 --> 00:07:35,000
and this is going to add a new step
185
00:07:36,000 --> 00:07:39,000
and change our data type, right?
186
00:07:39,000 --> 00:07:40,000
So instead of it being a decimal data type,
187
00:07:40,000 --> 00:07:42,000
we're changing it to currency.
188
00:07:42,000 --> 00:07:44,000
The other way that we could do this
189
00:07:44,000 --> 00:07:47,000
is we're gonna delete that last applied step,
190
00:07:47,000 --> 00:07:50,000
and I'm gonna select both of these columns.
191
00:07:50,000 --> 00:07:53,000
And then I want to head up here to the Transform,
192
00:07:53,000 --> 00:07:55,000
to my Data Type,
193
00:07:55,000 --> 00:07:57,000
and then Fixed decimal number.
194
00:07:58,000 --> 00:08:00,000
And we'll add this as a new step.
195
00:08:02,000 --> 00:08:05,000
Right, and so now this new Changed Type 1 step is saying,
196
00:08:05,000 --> 00:08:07,000
"Hey, we want to change
197
00:08:07,000 --> 00:08:09,000
or we want to transform these column types
198
00:08:09,000 --> 00:08:13,000
of ProductCost and ProductPrice both to Currency."
199
00:08:13,000 --> 00:08:15,000
So really cool that we can do that.
200
00:08:15,000 --> 00:08:17,000
One other thing that we could do here real quick
201
00:08:17,000 --> 00:08:20,000
is we can right-click on the step and rename it.
202
00:08:20,000 --> 00:08:23,000
And so let's rename this to
203
00:08:24,000 --> 00:08:26,000
Changed to Currency.
204
00:08:27,000 --> 00:08:30,000
Right, so now this is a little bit more readable,
205
00:08:30,000 --> 00:08:32,000
and we actually understand what's happening here.
206
00:08:32,000 --> 00:08:35,000
So overall things are looking really good.
207
00:08:35,000 --> 00:08:38,000
Right, there really isn't too much that we would have to do.
208
00:08:38,000 --> 00:08:40,000
We could probably load this file as is,
209
00:08:40,000 --> 00:08:42,000
and it would be totally workable.
210
00:08:42,000 --> 00:08:44,000
But let's say, for example,
211
00:08:44,000 --> 00:08:49,000
that we want to remove this ProductSize column.
212
00:08:49,000 --> 00:08:50,000
Right, when we start looking at it,
213
00:08:50,000 --> 00:08:54,000
we can kinda see that it's a little bit of a mess, right?
214
00:08:54,000 --> 00:08:57,000
We've got a mix between some numeric sizes
215
00:08:57,000 --> 00:08:58,000
and text-based sizes.
216
00:08:58,000 --> 00:09:01,000
And maybe we don't think that it's gonna be super helpful,
217
00:09:01,000 --> 00:09:03,000
so we just want to remove it.
218
00:09:04,000 --> 00:09:07,000
So again, there's multiple ways to accomplish this.
219
00:09:07,000 --> 00:09:10,000
We can right-click the ProductSize column header
220
00:09:10,000 --> 00:09:12,000
and click Remove, right?
221
00:09:12,000 --> 00:09:17,000
Or we can come up and remove columns from our header tools.
222
00:09:19,000 --> 00:09:22,000
Right, so Power Query adds a new step here
223
00:09:22,000 --> 00:09:24,000
for Removed Columns.
224
00:09:24,000 --> 00:09:27,000
We can see the associated M code generated here.
225
00:09:27,000 --> 00:09:29,000
We'll just collapse that formula bar back.
226
00:09:31,000 --> 00:09:33,000
And there we go. So it's really that easy.
227
00:09:33,000 --> 00:09:36,000
So just to quickly recap, as a best practice,
228
00:09:36,000 --> 00:09:40,000
try to get in the habit of updating your table names
229
00:09:40,000 --> 00:09:44,000
and confirming your data types and their accuracy
230
00:09:44,000 --> 00:09:46,000
every time you connect to a new data source.
231
00:09:46,000 --> 00:09:49,000
It's really paramount for everything downstream of it
232
00:09:49,000 --> 00:09:52,000
just to make sure that those are set up appropriately.
233
00:09:52,000 --> 00:09:55,000
All right, so, so far, we have connected to this data set.
234
00:09:55,000 --> 00:09:57,000
We've updated the table name.
235
00:09:57,000 --> 00:10:00,000
We've confirmed our data types.
236
00:10:00,000 --> 00:10:02,000
We updated a couple of data types to currency.
237
00:10:02,000 --> 00:10:04,000
We've removed our columns.
238
00:10:04,000 --> 00:10:08,000
So the next thing that we can do here is actually close
239
00:10:08,000 --> 00:10:10,000
and apply these changes, right?
240
00:10:10,000 --> 00:10:12,000
And what this does is this actually is going
241
00:10:12,000 --> 00:10:15,000
to load the data into the data model,
242
00:10:15,000 --> 00:10:17,000
and we'll see that in the front end of Power BI.
243
00:10:19,000 --> 00:10:21,000
Right, so now that that's finished loading,
244
00:10:21,000 --> 00:10:23,000
over here we've got both of our Product Lookup
245
00:10:24,000 --> 00:10:27,000
and Territory Lookup data sets.
246
00:10:27,000 --> 00:10:29,000
We can head over into the Data view,
247
00:10:29,000 --> 00:10:31,000
and same thing, we can see the Product Lookup
248
00:10:34,000 --> 00:10:36,000
and Territory Lookup data sets here.
249
00:10:37,000 --> 00:10:40,000
And then if we jump over into the Data Model view,
250
00:10:40,000 --> 00:10:43,000
I will scroll out real quick so we can see both tables.
251
00:10:43,000 --> 00:10:45,000
And within the Data Model view,
252
00:10:45,000 --> 00:10:47,000
we can see that we have added both the Territory Lookup
253
00:10:47,000 --> 00:10:50,000
and Product Lookup tables here.
254
00:10:50,000 --> 00:10:52,000
All right, congratulations,
255
00:10:52,000 --> 00:10:54,000
you've just loaded your first two tables
256
00:10:54,000 --> 00:10:56,000
into your data model.
257
00:10:56,000 --> 00:11:00,000
From here, our last step is to really just save our work.
258
00:11:00,000 --> 00:11:02,000
And from here, you're good to go.
20660
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.