1
00:00:00,000 --> 00:00:01,000
Instructor: I'd like to take a moment
2
00:00:01,000 --> 00:00:04,000
and talk about some best practices when it comes to creating
3
00:00:04,000 --> 00:00:08,000
calculated columns or adding new features within a data set.
4
00:00:08,000 --> 00:00:10,000
So as a best practice, table transformations
5
00:00:10,000 --> 00:00:13,000
and column calculations should ideally happen
6
00:00:13,000 --> 00:00:16,000
as close to the original data source as possible.
7
00:00:16,000 --> 00:00:18,000
Now, the reason behind this best practice
8
00:00:18,000 --> 00:00:21,000
is that when calculations happen at the source,
9
00:00:21,000 --> 00:00:24,000
you're able to use the original data source engine
10
00:00:24,000 --> 00:00:27,000
like the SQL Server Engine or the MySQL Engine
11
00:00:27,000 --> 00:00:30,000
or Excel's workbook engine and not rely
12
00:00:30,000 --> 00:00:34,000
on Power BI's Engine to do these calculations.
13
00:00:34,000 --> 00:00:36,000
Basically, all of this boils down
14
00:00:36,000 --> 00:00:40,000
to optimizing Power BI reports for performance and speed.
15
00:00:40,000 --> 00:00:42,000
Here's a high-level look at the different places
16
00:00:42,000 --> 00:00:43,000
where calculations should be made
17
00:00:43,000 --> 00:00:47,000
or added as part of a report development project.
18
00:00:47,000 --> 00:00:49,000
Ideally, new data features and columns are added
19
00:00:49,000 --> 00:00:52,000
as far upstream as possible, which means
20
00:00:52,000 --> 00:00:55,000
as close to the original data source as possible.
21
00:00:55,000 --> 00:00:58,000
And if that can't happen, then your next best option
22
00:00:58,000 --> 00:01:01,000
is to create calculated columns using the query editor,
23
00:01:01,000 --> 00:01:03,000
followed by the Power BI front end,
24
00:01:03,000 --> 00:01:06,000
and last, in published reports or dashboards,
25
00:01:06,000 --> 00:01:08,000
if that functionality exists.
26
00:01:08,000 --> 00:01:10,000
So at this point, you might be thinking
27
00:01:10,000 --> 00:01:12,000
that Power Query and Power BI's front end
28
00:01:12,000 --> 00:01:15,000
are all part of the same Power BI tool.
29
00:01:15,000 --> 00:01:18,000
Why is Power Query more efficient than the front end?
30
00:01:18,000 --> 00:01:20,000
And that's an awesome question.
31
00:01:20,000 --> 00:01:23,000
The reason is that Power BI's internal engine,
32
00:01:23,000 --> 00:01:26,000
called VertiPaq, creates something called the query plan
33
00:01:26,000 --> 00:01:28,000
when you press load and apply
34
00:01:28,000 --> 00:01:31,000
to load the data into Power BI's data model.
35
00:01:31,000 --> 00:01:34,000
And this query plan takes all of the raw source data
36
00:01:34,000 --> 00:01:38,000
transformation steps, any calculations that have been made,
37
00:01:38,000 --> 00:01:40,000
columns that have been added, et cetera,
38
00:01:40,000 --> 00:01:44,000
and it determines the best way to compress
39
00:01:44,000 --> 00:01:47,000
all of that data before loading it into Power BI's memory.
40
00:01:47,000 --> 00:01:50,000
Remember that lecture on storage and connection modes?
41
00:01:50,000 --> 00:01:53,000
When you're using import mode, the data is loaded
42
00:01:53,000 --> 00:01:57,000
into Power BI, and the VertiPaq engine needs to create
43
00:01:57,000 --> 00:02:00,000
this highly optimized plan for fast
44
00:02:00,000 --> 00:02:02,000
and reliable performance.
45
00:02:02,000 --> 00:02:04,000
So when new calculations, new columns
46
00:02:04,000 --> 00:02:08,000
and other data features are added using Power BI's front end,
47
00:02:08,000 --> 00:02:10,000
they're not added to that same query plan
48
00:02:10,000 --> 00:02:14,000
and can't take advantage of the compression methods.
49
00:02:14,000 --> 00:02:16,000
So because of this, you might disproportionately
50
00:02:16,000 --> 00:02:19,000
bloat your data model size.
51
00:02:19,000 --> 00:02:21,000
Now, there is a lot of nuance involved here
52
00:02:21,000 --> 00:02:24,000
and it's generally outside the scope of this course,
53
00:02:24,000 --> 00:02:27,000
but just keep in mind that when possible,
54
00:02:27,000 --> 00:02:30,000
add columns at the source. If you can't,
55
00:02:30,000 --> 00:02:33,000
then try the query editor. If that doesn't work,
56
00:02:33,000 --> 00:02:35,000
then use the front-end tools in Power BI.
57
00:02:35,000 --> 00:02:38,000
One last call out here before we move on.
58
00:02:38,000 --> 00:02:41,000
This best practice isn't a strict requirement or rule,
59
00:02:41,000 --> 00:02:42,000
but it's really something
60
00:02:42,000 --> 00:02:44,000
that can significantly impact performance
61
00:02:44,000 --> 00:02:47,000
for very large or complex data models.
62
00:02:47,000 --> 00:02:51,000
And I totally understand that where you define calculations
63
00:02:51,000 --> 00:02:53,000
often depends on several factors
64
00:02:53,000 --> 00:02:57,000
like accessibility to data sources, complexity,
65
00:02:57,000 --> 00:03:00,000
business requirements, your own ability
66
00:03:00,000 --> 00:03:02,000
and level within Power BI.
67
00:03:02,000 --> 00:03:04,000
So basically, we'll be practicing different methods
68
00:03:04,000 --> 00:03:07,000
to create columns using both the query editor,
69
00:03:07,000 --> 00:03:09,000
like we've already done, and DAX,
70
00:03:09,000 --> 00:03:11,000
which is found in Power BI's front end.