Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:00,000 --> 00:00:01,000
Instructor: Okay, so by this point,
2
00:00:01,000 --> 00:00:03,000
you should be pretty comfortable
3
00:00:03,000 --> 00:00:05,000
with the basic Query editing tools,
4
00:00:05,000 --> 00:00:08,000
your date tools and text tools and number tools.
5
00:00:08,000 --> 00:00:09,000
And for the most part,
6
00:00:09,000 --> 00:00:10,000
they've all been pretty straightforward
7
00:00:10,000 --> 00:00:12,000
and pretty intuitive,
8
00:00:12,000 --> 00:00:14,000
but there are some pretty interesting,
9
00:00:14,000 --> 00:00:16,000
maybe not so intuitive tools
10
00:00:16,000 --> 00:00:18,000
that are available within the Query Editor.
11
00:00:18,000 --> 00:00:20,000
And one thing that I want to cover now
12
00:00:20,000 --> 00:00:24,000
is this idea of pivoting and unpivoting a table.
13
00:00:24,000 --> 00:00:26,000
So the way I describe it,
14
00:00:26,000 --> 00:00:29,000
pivoting is just a fancy way to describe the process
15
00:00:29,000 --> 00:00:32,000
of turning distinct row values into columns,
16
00:00:32,000 --> 00:00:34,000
which is called pivoting,
17
00:00:34,000 --> 00:00:36,000
or turning columns into rows,
18
00:00:36,000 --> 00:00:38,000
which is called unpivoting.
19
00:00:38,000 --> 00:00:40,000
Now, it sounds pretty simple on the surface
20
00:00:40,000 --> 00:00:44,000
but it's actually a little bit tricky to get the hang of,
21
00:00:44,000 --> 00:00:45,000
at least it was for me.
22
00:00:45,000 --> 00:00:47,000
And I'm a visual learner
23
00:00:47,000 --> 00:00:48,000
and what I'm about to show you
24
00:00:48,000 --> 00:00:52,000
is what finally caused it to click for me.
25
00:00:52,000 --> 00:00:54,000
So imagine you've got this simple table.
26
00:00:54,000 --> 00:00:55,000
You've got dates in the first column,
27
00:00:55,000 --> 00:00:57,000
followed by product categories,
28
00:00:57,000 --> 00:00:59,000
then region, and quantity sold.
29
00:00:59,000 --> 00:01:03,000
If we were to select the Region and Quantity Sold columns
30
00:01:03,000 --> 00:01:04,000
and then pivot this table,
31
00:01:04,000 --> 00:01:08,000
we would essentially take the region in each row
32
00:01:08,000 --> 00:01:10,000
and turn it into its own column.
33
00:01:10,000 --> 00:01:13,000
So instead of having a row for North Region,
34
00:01:13,000 --> 00:01:16,000
we would have a column
35
00:01:16,000 --> 00:01:19,000
and the same applies for Central and South Regions too.
36
00:01:19,000 --> 00:01:21,000
So what we've basically done here
37
00:01:21,000 --> 00:01:24,000
is flip that table from a vertical format
38
00:01:24,000 --> 00:01:27,000
up into a horizontal format.
39
00:01:27,000 --> 00:01:30,000
We've turned rows into columns.
40
00:01:30,000 --> 00:01:31,000
Now on the flip side,
41
00:01:31,000 --> 00:01:35,000
if we started out with the table in that horizontal format
42
00:01:35,000 --> 00:01:37,000
and we selected the regions in the first row,
43
00:01:37,000 --> 00:01:42,000
we could unpivot to transform it into that vertical form.
44
00:01:42,000 --> 00:01:45,000
Now, in other words, we're taking those regions,
45
00:01:45,000 --> 00:01:46,000
which had been columns,
46
00:01:46,000 --> 00:01:49,000
and transform them into rows.
47
00:01:49,000 --> 00:01:51,000
So my little mental tip here
48
00:01:51,000 --> 00:01:54,000
is that I imagine that the table is kind of like on a hinge
49
00:01:54,000 --> 00:01:56,000
in that upper left hand corner.
50
00:01:56,000 --> 00:02:00,000
Pivoting is like rotating it up from vertical to horizontal
51
00:02:00,000 --> 00:02:03,000
and unpivoting is like rotating down
52
00:02:03,000 --> 00:02:06,000
from horizontal to vertical.
53
00:02:06,000 --> 00:02:08,000
Now, one thing to note here,
54
00:02:08,000 --> 00:02:10,000
there's actually another tool in the Query Editor
55
00:02:10,000 --> 00:02:11,000
called Transpose
56
00:02:11,000 --> 00:02:14,000
and that works in a really similar way
57
00:02:14,000 --> 00:02:16,000
but the difference is that it doesn't recognize
58
00:02:16,000 --> 00:02:17,000
unique values.
59
00:02:17,000 --> 00:02:19,000
So if you're dealing with a table
60
00:02:19,000 --> 00:02:20,000
like the one we're looking at here,
61
00:02:20,000 --> 00:02:23,000
where we have duplicate dates and quantities,
62
00:02:23,000 --> 00:02:27,000
pivoting and transposing would yield very different results.
63
00:02:27,000 --> 00:02:28,000
So one way to think about that
64
00:02:28,000 --> 00:02:30,000
is that Transpose is kind of clunky.
65
00:02:30,000 --> 00:02:33,000
It's kind of like using brute force.
66
00:02:33,000 --> 00:02:35,000
It just takes an entire table
67
00:02:35,000 --> 00:02:37,000
and flips it up on its side, right?
68
00:02:37,000 --> 00:02:40,000
So you would have duplicates for each date
69
00:02:40,000 --> 00:02:43,000
and category and so on and so forth.
70
00:02:43,000 --> 00:02:46,000
So I actually have a little demo file here
71
00:02:46,000 --> 00:02:47,000
that I'm gonna show you
72
00:02:47,000 --> 00:02:49,000
because you really do have to see this happening
73
00:02:49,000 --> 00:02:50,000
in real time
74
00:02:50,000 --> 00:02:53,000
in order to kind of understand and grasp what's going on.
75
00:02:53,000 --> 00:02:55,000
So let's jump over to Power BI
76
00:02:55,000 --> 00:02:57,000
and we'll show you what this looks like.
77
00:02:58,000 --> 00:03:01,000
All right, so we're gonna go back up
78
00:03:01,000 --> 00:03:03,000
and we are gonna grab a new data set,
79
00:03:03,000 --> 00:03:05,000
and there is a data set
80
00:03:05,000 --> 00:03:09,000
called Product Category Sales Unpivot Demo,
81
00:03:09,000 --> 00:03:11,000
and you can open this one to kind of follow along
82
00:03:11,000 --> 00:03:13,000
and play around with the data set on your own,
83
00:03:13,000 --> 00:03:14,000
if you're so inclined.
84
00:03:16,000 --> 00:03:20,000
All right, we get our preview and we'll add this in.
85
00:03:20,000 --> 00:03:21,000
Again, for the sake of the demo,
86
00:03:21,000 --> 00:03:24,000
we don't really need to update the name or data types
87
00:03:24,000 --> 00:03:25,000
or anything like that.
88
00:03:25,000 --> 00:03:27,000
If there's any errors here, you could address those,
89
00:03:27,000 --> 00:03:29,000
but everything should be good to go.
90
00:03:29,000 --> 00:03:31,000
All right, so you can see here
91
00:03:31,000 --> 00:03:35,000
we're actually starting off with this table pivoted
92
00:03:35,000 --> 00:03:38,000
and let's say that we were given this table
93
00:03:38,000 --> 00:03:41,000
by a team of sales reps that had been collecting sales data
94
00:03:41,000 --> 00:03:44,000
for each of their regions by product and date
95
00:03:44,000 --> 00:03:46,000
and they combined all of their work
96
00:03:46,000 --> 00:03:48,000
into a single data set, right?
97
00:03:48,000 --> 00:03:50,000
And they give it to you and they say,
98
00:03:50,000 --> 00:03:52,000
"Hey, we need some help with some data visualization here.
99
00:03:52,000 --> 00:03:57,000
And we really want to be able to filter all of our sales
100
00:03:57,000 --> 00:03:59,000
by date and product and region."
101
00:03:59,000 --> 00:04:02,000
And the way that this datasets configured,
102
00:04:02,000 --> 00:04:06,000
it doesn't make that very easy to add a Region filter.
103
00:04:06,000 --> 00:04:07,000
So as an analyst,
104
00:04:07,000 --> 00:04:10,000
as someone who needs to take this data and interpret it
105
00:04:10,000 --> 00:04:12,000
and analyze it and explore it,
106
00:04:12,000 --> 00:04:13,000
this type of table format
107
00:04:13,000 --> 00:04:16,000
can lead to a bunch of different headaches.
108
00:04:16,000 --> 00:04:18,000
And really what we're looking for here
109
00:04:18,000 --> 00:04:20,000
is a rectangular table
110
00:04:20,000 --> 00:04:23,000
with each dimension or metric as its own column
111
00:04:23,000 --> 00:04:26,000
and each observation as a row.
112
00:04:26,000 --> 00:04:28,000
So ideally, what we're looking for here
113
00:04:28,000 --> 00:04:31,000
is to transform this whole table
114
00:04:31,000 --> 00:04:33,000
into a four-column table
115
00:04:33,000 --> 00:04:35,000
that contains date, product category,
116
00:04:35,000 --> 00:04:36,000
a column for region,
117
00:04:36,000 --> 00:04:39,000
and then another column for my sales quantity.
118
00:04:39,000 --> 00:04:41,000
Once we get it in that format,
119
00:04:41,000 --> 00:04:43,000
that would give me a form
120
00:04:43,000 --> 00:04:45,000
that I could then plug into my data model
121
00:04:45,000 --> 00:04:47,000
and analyze in any way that I choose.
122
00:04:47,000 --> 00:04:49,000
So let's take a look
123
00:04:49,000 --> 00:04:51,000
at some of these pivoting and unpivoting tools
124
00:04:51,000 --> 00:04:53,000
to see if they can help us.
125
00:04:53,000 --> 00:04:57,000
All right, so because we're starting in this pivoted mode,
126
00:04:57,000 --> 00:04:58,000
what we want to do
127
00:04:58,000 --> 00:05:00,000
is we want to unpivot these columns, right?
128
00:05:00,000 --> 00:05:05,000
So I'm gonna select my different regions here
129
00:05:06,000 --> 00:05:08,000
and we'll come up to Transform,
130
00:05:08,000 --> 00:05:10,000
and you can see we have Unpivot.
131
00:05:10,000 --> 00:05:12,000
I've got a few options here,
132
00:05:12,000 --> 00:05:14,000
Unpivot Columns, Unpivot Other Columns,
133
00:05:14,000 --> 00:05:17,000
Unpivot Only Selected Columns.
134
00:05:17,000 --> 00:05:20,000
And because we've selected the columns we want to unpivot,
135
00:05:20,000 --> 00:05:23,000
we'll click Unpivot Columns, right?
136
00:05:23,000 --> 00:05:26,000
And what this does is it creates an Attribute column
137
00:05:26,000 --> 00:05:28,000
and then a Value column, right?
138
00:05:28,000 --> 00:05:31,000
So from here, we could name this Region, right,
139
00:05:33,000 --> 00:05:37,000
and this could be, you know, Quantity Sold.
140
00:05:40,000 --> 00:05:40,000
So very quickly,
141
00:05:40,000 --> 00:05:44,000
we've been able to get our data set in a format
142
00:05:44,000 --> 00:05:48,000
that lends itself a lot better to being analyzed.
143
00:05:48,000 --> 00:05:51,000
Now we could go back to the pivoted version.
144
00:05:51,000 --> 00:05:53,000
All right, so we select our two columns,
145
00:05:53,000 --> 00:05:55,000
we'll come up here to Pivot,
146
00:05:55,000 --> 00:05:57,000
and from our dialogue,
147
00:05:57,000 --> 00:06:00,000
use the names in the column Region to create new columns.
148
00:06:00,000 --> 00:06:03,000
Our value column is our Quantity Sold,
149
00:06:03,000 --> 00:06:04,000
so we're good there.
150
00:06:04,000 --> 00:06:05,000
We'll click OK, right,
151
00:06:05,000 --> 00:06:09,000
and then we're back to that pivoted version of the table.
152
00:06:09,000 --> 00:06:11,000
So again, pretty straightforward demo here
153
00:06:11,000 --> 00:06:14,000
when you've got a table that contains columns like this
154
00:06:14,000 --> 00:06:17,000
and you need to get it into a little bit better of a format.
155
00:06:17,000 --> 00:06:18,000
What we can do here
156
00:06:18,000 --> 00:06:22,000
is we can delete these last applied steps here
157
00:06:22,000 --> 00:06:26,000
to return the table back to its original format, right?
158
00:06:26,000 --> 00:06:28,000
And the last thing that I want to show you here
159
00:06:28,000 --> 00:06:30,000
is that the Transpose option
160
00:06:30,000 --> 00:06:33,000
that we talked about in the slides here, right,
161
00:06:33,000 --> 00:06:35,000
again, like that's this brute force option.
162
00:06:35,000 --> 00:06:37,000
You take a look at our rows here, right?
163
00:06:37,000 --> 00:06:42,000
We've got four different rows for July 1st,
164
00:06:42,000 --> 00:06:45,000
we've got four different rows for July 2nd,
165
00:06:45,000 --> 00:06:48,000
and what happens is it retains everything,
166
00:06:48,000 --> 00:06:49,000
it doesn't group it.
167
00:06:49,000 --> 00:06:52,000
So if we click Transpose,
168
00:06:53,000 --> 00:06:56,000
you can see here that we've got four columns
169
00:06:56,000 --> 00:06:58,000
for July 1st, right?
170
00:06:58,000 --> 00:07:01,000
For bike, components, clothing, accessories, right?
171
00:07:01,000 --> 00:07:06,000
If we scroll over, we've got four rows for the second.
172
00:07:06,000 --> 00:07:09,000
Again, bike, component, clothing, accessories.
173
00:07:09,000 --> 00:07:13,000
So again, using transpose in this case doesn't make sense,
174
00:07:13,000 --> 00:07:14,000
'cause again it's gonna split these all
175
00:07:14,000 --> 00:07:16,000
into their own columns.
176
00:07:16,000 --> 00:07:17,000
And again, that's not the desired state
177
00:07:17,000 --> 00:07:19,000
that we're looking for.
178
00:07:20,000 --> 00:07:22,000
All right, so there you have it.
179
00:07:22,000 --> 00:07:25,000
That's your demo on pivoting and unpivoting a data set.
13814
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.