0
00:00:05,318 --> 00:00:09,260
Today, Business Intelligence is a reasonably well-understood term.
1
00:00:09,260 --> 00:00:11,890
However, let's begin with the definition that's been sourced
2
00:00:11,890 --> 00:00:13,240
from Gartner.
3
00:00:13,240 --> 00:00:15,740
Now they define Business Intelligence as a broad category
4
00:00:15,740 --> 00:00:18,760
of applications and technologies for gathering, storing,
5
00:00:18,760 --> 00:00:22,360
analyzing, sharing, and providing access to data to help
6
00:00:22,360 --> 00:00:26,000
enterprise users make better business decisions.
7
00:00:26,000 --> 00:00:29,010
Now, I've worked in the industry for well over 15 years, and
8
00:00:29,010 --> 00:00:32,560
I certainly wouldn't argue with that definition.
9
00:00:32,560 --> 00:00:35,090
I may suggest that it could be put more simply,
10
00:00:35,090 --> 00:00:38,440
and that is that it could be described as the application of
11
00:00:38,440 --> 00:00:41,770
knowledge derived from analyzing the business' data
12
00:00:41,770 --> 00:00:44,080
to effect a more positive outcome.
13
00:00:44,080 --> 00:00:47,298
And arguably, this could be put more simply.
14
00:00:47,298 --> 00:00:50,807
And so I'll land at this simple definition that is about
15
00:00:50,807 --> 00:00:54,915
the transformation of your data assets into knowledge to support
16
00:00:54,915 --> 00:00:58,586
the decisions that are being made by your business users.
17
00:00:58,586 --> 00:01:02,438
Let's understand then how BI is used by the decision makers or
18
00:01:02,438 --> 00:01:03,800
your users.
19
00:01:03,800 --> 00:01:06,480
Typically like placing a finger on the pulse,
20
00:01:06,480 --> 00:01:08,510
it's about understanding the health.
21
00:01:08,510 --> 00:01:09,922
And when it comes to the health of a business,
22
00:01:09,922 --> 00:01:13,080
you'll find that data is the blood of the business.
23
00:01:13,080 --> 00:01:15,940
And therefore, we'll look to the data to help us understand
24
00:01:15,940 --> 00:01:18,700
the whats and hows of what's going on.
25
00:01:18,700 --> 00:01:21,120
What were the sales and how are they
26
00:01:21,120 --> 00:01:24,620
in comparison to the goals and targets that we've set?
27
00:01:24,620 --> 00:01:28,198
Next, the collaboration on a shared view of not just data but
28
00:01:28,198 --> 00:01:29,539
also business logic.
29
00:01:29,539 --> 00:01:32,239
And so I'll ask you to consider the scenario that you're in
30
00:01:32,239 --> 00:01:33,538
a meeting with colleagues.
31
00:01:33,538 --> 00:01:36,452
You are referring to last month's profitability and
32
00:01:36,452 --> 00:01:39,580
you understand that you have different numbers.
33
00:01:39,580 --> 00:01:42,910
And so you spend not so productive time in a meeting
34
00:01:42,910 --> 00:01:45,510
determining who is correct and who is wrong.
35
00:01:45,510 --> 00:01:48,010
Well, the reason for this, and it's not so
36
00:01:48,010 --> 00:01:50,810
uncommon today, is that the way that
37
00:01:50,810 --> 00:01:54,320
the profit was being calculated was perhaps incorrect.
38
00:01:54,320 --> 00:01:56,790
One was looking at a definition of gross profit, or
39
00:01:56,790 --> 00:02:00,280
one was looking at the definition of net profit.
40
00:02:00,280 --> 00:02:02,560
It could also be that the data sources, or
41
00:02:02,560 --> 00:02:05,480
even the definition of time itself, was it a fiscal period,
42
00:02:05,480 --> 00:02:07,410
was it a calendar period?
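To make the meeting scenario concrete, here is a minimal sketch of how two colleagues can start from the same figures and still report different profit. The amounts and the simplified definitions of gross and net profit are assumptions for illustration only, not figures from this course.

```python
# Hypothetical figures for last month (illustrative only).
revenue = 1000.0
cost_of_goods_sold = 600.0
operating_expenses = 250.0
interest_and_taxes = 50.0


def gross_profit(revenue, cost_of_goods_sold):
    """Gross profit: revenue minus the direct cost of the goods sold."""
    return revenue - cost_of_goods_sold


def net_profit(revenue, cost_of_goods_sold, operating_expenses, interest_and_taxes):
    """Net profit: gross profit minus operating expenses, interest and taxes."""
    gross = gross_profit(revenue, cost_of_goods_sold)
    return gross - operating_expenses - interest_and_taxes


gross = gross_profit(revenue, cost_of_goods_sold)
net = net_profit(revenue, cost_of_goods_sold, operating_expenses, interest_and_taxes)
print("gross profit:", gross)
print("net profit:", net)
```

Both numbers are valid answers to "what was last month's profit?" until the business agrees on which definition a report uses.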
43
00:02:07,410 --> 00:02:10,660
And so we often refer to the terminology or the objective
44
00:02:10,660 --> 00:02:14,528
in business intelligence of a single version of the truth.
45
00:02:14,528 --> 00:02:17,300
Now while I'll share with you that that's rather difficult
46
00:02:17,300 --> 00:02:20,098
to achieve, we certainly strive towards it.
47
00:02:20,098 --> 00:02:24,700
In doing so, we eliminate or attempt to eliminate
48
00:02:24,700 --> 00:02:27,800
the duplication of data and the duplication of business logic
49
00:02:27,800 --> 00:02:29,440
and also the duplication of effort.
50
00:02:30,490 --> 00:02:33,350
Lastly, and increasingly important these days,
51
00:02:33,350 --> 00:02:36,150
is the ability to reduce the time to decision.
52
00:02:36,150 --> 00:02:38,780
Business users don't wanna learn today that
53
00:02:38,780 --> 00:02:41,900
two weeks ago they were running low on an important ingredient
54
00:02:41,900 --> 00:02:43,840
in the manufacturing process.
55
00:02:43,840 --> 00:02:48,072
They need up-to-date data, even up to the second.
56
00:02:48,072 --> 00:02:50,659
The goal then of business intelligence is often to
57
00:02:50,659 --> 00:02:52,280
do things better.
58
00:02:52,280 --> 00:02:56,270
And therefore, we should expect it to impact on the bottom line
59
00:02:56,270 --> 00:02:59,400
by improving the way that we measure.
60
00:02:59,400 --> 00:03:02,190
And it can also come down to enhancing competitive advantage.
61
00:03:02,190 --> 00:03:06,110
Consider this, if your competitors are successfully
62
00:03:06,110 --> 00:03:08,120
implementing business intelligence and
63
00:03:08,120 --> 00:03:11,350
transforming their data into effective knowledge to support
64
00:03:11,350 --> 00:03:14,920
good business decisions, then they indeed have a competitive
65
00:03:14,920 --> 00:03:18,360
advantage over you if you're not achieving the same.
66
00:03:18,360 --> 00:03:21,410
So I will stress, for some organizations today
67
00:03:21,410 --> 00:03:24,000
they continue to consider that business intelligence is
68
00:03:24,000 --> 00:03:26,300
an afterthought or a lower priority.
69
00:03:26,300 --> 00:03:29,930
We would stress regardless of size of business, whether it's
70
00:03:29,930 --> 00:03:33,070
small, medium or large, that business intelligence should be
71
00:03:33,070 --> 00:03:35,700
considered an essential part of the IT portfolio.
72
00:03:36,780 --> 00:03:38,880
Now as an experienced professional,
73
00:03:38,880 --> 00:03:41,020
delivering business intelligence solutions,
74
00:03:41,020 --> 00:03:44,290
I can attest to the fact that solutions encompass and require
75
00:03:44,290 --> 00:03:48,880
an understanding to implement broad spectrums of technologies.
76
00:03:48,880 --> 00:03:50,000
And in this training course,
77
00:03:50,000 --> 00:03:53,150
we'll be exploring data warehousing and the ecosystem
78
00:03:53,150 --> 00:03:56,970
that supports the delivery of the enterprise data warehouse.
79
00:03:56,970 --> 00:03:59,950
Let's then consider the perspective from the business
80
00:03:59,950 --> 00:04:03,550
users and the types of questions they ask and the challenges that
81
00:04:03,550 --> 00:04:06,580
we may have in delivering responses to these questions.
82
00:04:06,580 --> 00:04:08,650
The first is reasonably straightforward.
83
00:04:08,650 --> 00:04:11,288
What sales have been made, and where?
84
00:04:11,288 --> 00:04:15,340
With a sales system, we're likely able to group by,
85
00:04:15,340 --> 00:04:18,610
filter, and summarize to answer this question.
86
00:04:18,610 --> 00:04:20,560
When it comes to the second example here of
87
00:04:20,560 --> 00:04:23,580
the salespeople's performance, it implies that there is some
88
00:04:23,580 --> 00:04:27,900
target or goal to measure the salespeople's activity against.
89
00:04:27,900 --> 00:04:31,150
So there would be an expectation from my understanding that
90
00:04:31,150 --> 00:04:34,490
there would be planning systems with approved values
91
00:04:34,490 --> 00:04:36,910
ensuring the ability to compare and
92
00:04:36,910 --> 00:04:39,550
measure performance in future periods.
93
00:04:39,550 --> 00:04:42,650
Next, which customers are likely to buy from us?
94
00:04:42,650 --> 00:04:47,100
And so this isn't a query that you could easily write against
95
00:04:47,100 --> 00:04:48,300
an operational system.
96
00:04:49,950 --> 00:04:52,690
The customers that are likely to buy from us, it implies that
97
00:04:52,690 --> 00:04:56,310
there are patterns, and customers are typically defined in terms of
98
00:04:56,310 --> 00:04:59,600
their demographics like age, location, gender.
99
00:05:00,780 --> 00:05:03,450
We would be able to detect patterns
100
00:05:03,450 --> 00:05:07,510
if we could use technology to understand what has happened,
101
00:05:07,510 --> 00:05:10,120
what have customers' purchasing patterns been?
102
00:05:10,120 --> 00:05:13,580
And therefore if these patterns can be revealed from data,
103
00:05:13,580 --> 00:05:16,850
then it's likely that we could predict from those patterns,
104
00:05:16,850 --> 00:05:19,640
with a reasonable degree of accuracy.
105
00:05:19,640 --> 00:05:21,360
Which products do our customers buy together?
106
00:05:21,360 --> 00:05:25,110
Here's another example that involves analyzing the relationships
107
00:05:25,110 --> 00:05:29,120
between data, purchases, browsing.
108
00:05:29,120 --> 00:05:31,640
And commonly used in online retail today that when I'm
109
00:05:31,640 --> 00:05:34,130
browsing for a product, I like to see useful and
110
00:05:34,130 --> 00:05:37,480
relevant suggestions to entice me to buy more.
111
00:05:37,480 --> 00:05:38,270
What drives this?
112
00:05:38,270 --> 00:05:41,390
Is it a simple query, or as is the case,
113
00:05:41,390 --> 00:05:46,330
is a deeper process in place to deliver this question's answer?
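As a rough illustration of why this is more than a simple lookup, the sketch below counts how often pairs of products appear in the same order. The orders and product names are hypothetical, and real recommendation processes are typically much deeper than a pair count; this only shows the basic idea of analyzing relationships between purchases.

```python
from collections import Counter
from itertools import combinations

# Hypothetical orders, each a set of products bought together.
orders = [
    {"tent", "sleeping bag", "stove"},
    {"tent", "sleeping bag"},
    {"stove", "fuel"},
    {"tent", "stove"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(orders)
print(counts[("sleeping bag", "tent")])
print(counts[("fuel", "stove")])
```

Pairs with the highest counts are candidates for "bought together" suggestions, which is already a different kind of question from "what sales have been made, and where?".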
114
00:05:46,330 --> 00:05:49,940
Lastly, what is the customers' sentiment of the new product?
115
00:05:49,940 --> 00:05:52,563
So increasingly with social avenues,
116
00:05:52,563 --> 00:05:56,702
people are liking things, people are commenting on things.
117
00:05:56,702 --> 00:06:00,086
And now there's a need to aggregate data from a vast
118
00:06:00,086 --> 00:06:03,945
variety and formats representing people's opinions and
119
00:06:03,945 --> 00:06:07,724
thoughts and attitudes and somehow producing a response
120
00:06:07,724 --> 00:06:11,850
that tells us what people feel about our new product.
121
00:06:11,850 --> 00:06:15,800
Increasingly, as the questions become more complex,
122
00:06:15,800 --> 00:06:17,260
it delivers more challenges for
123
00:06:17,260 --> 00:06:19,420
us in delivering business intelligence.
124
00:06:19,420 --> 00:06:23,026
So common challenges that we'll see up front,
125
00:06:23,026 --> 00:06:27,817
typically, today involving volume, variety and velocity.
126
00:06:27,817 --> 00:06:30,819
We have enormous systems collecting
127
00:06:30,819 --> 00:06:33,840
enormous data at enormous rates.
128
00:06:33,840 --> 00:06:36,630
And it's not all conveniently in relational stores that makes
129
00:06:36,630 --> 00:06:38,000
my job easier.
130
00:06:38,000 --> 00:06:39,680
It could also be in file format.
131
00:06:39,680 --> 00:06:42,280
It could also be in unstructured format.
132
00:06:43,300 --> 00:06:46,620
It could also reside, not just conveniently on premises, but
133
00:06:46,620 --> 00:06:48,210
it might also be in the cloud,
134
00:06:48,210 --> 00:06:52,450
whether it's my cloud services or whether it's a software as
135
00:06:52,450 --> 00:06:54,580
a service provider managing my data.
136
00:06:55,588 --> 00:06:58,187
Now from a business user looking to answer questions
137
00:06:58,187 --> 00:07:01,435
from the data, it may not be so straightforward, the volumes,
138
00:07:01,435 --> 00:07:03,042
variety, and velocity aside.
139
00:07:03,042 --> 00:07:04,560
How can this be easily queried?
140
00:07:04,560 --> 00:07:07,380
If the data resides in a series of files,
141
00:07:07,380 --> 00:07:08,830
how can a business user query that?
142
00:07:08,830 --> 00:07:11,630
That's challenging even for me as an IT professional.
143
00:07:12,640 --> 00:07:15,780
If it is conveniently in a relational store, which often
144
00:07:15,780 --> 00:07:19,930
our operational data is, is it optimized for analytic queries?
145
00:07:19,930 --> 00:07:21,780
And in this training course we will be talking about
146
00:07:21,780 --> 00:07:22,590
optimization.
147
00:07:22,590 --> 00:07:25,770
And it's important to understand that BI drives a different
148
00:07:25,770 --> 00:07:28,250
workload against relational systems.
149
00:07:29,610 --> 00:07:32,690
The workload typically works like this, that when I look at
150
00:07:32,690 --> 00:07:37,040
a report that shows me products down the rows and the 12 months
151
00:07:37,040 --> 00:07:41,450
of the year, and I see the sales of each product by month.
152
00:07:41,450 --> 00:07:44,781
What isn't so easily understandable is that there
153
00:07:44,781 --> 00:07:48,854
could be billions of detail rows that were required to be aggregated
154
00:07:48,854 --> 00:07:50,864
to produce that simple matrix.
155
00:07:50,864 --> 00:07:53,938
What that requires then is analytic queries that can filter,
156
00:07:53,938 --> 00:07:55,630
group by, and aggregate.
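The workload just described can be sketched in a few lines. This is a minimal illustration with a handful of hypothetical detail rows; the row layout and the `sales_matrix` function are assumptions for the example, and a real system would run this as a query over far larger volumes.

```python
from collections import defaultdict

# Hypothetical detail rows: (product, year, month, sales_amount).
detail_rows = [
    ("Bike", 2023, 1, 100.0),
    ("Bike", 2023, 1, 50.0),
    ("Bike", 2023, 2, 75.0),
    ("Helmet", 2023, 1, 20.0),
    ("Helmet", 2022, 1, 999.0),  # a different year, so the filter drops it
]


def sales_matrix(rows, year):
    """Filter to one year, group by product and month, and sum the sales."""
    matrix = defaultdict(lambda: [0.0] * 12)  # product -> 12 monthly totals
    for product, row_year, month, amount in rows:
        if row_year != year:  # filter
            continue
        matrix[product][month - 1] += amount  # group by and aggregate
    return dict(matrix)


matrix = sales_matrix(detail_rows, 2023)
for product in sorted(matrix):
    print(product, matrix[product])
```

Every detail row has to be read to produce a small products-by-months result, which is why this kind of query suits a read-intensive system.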
157
00:07:55,630 --> 00:07:57,880
Now while relational systems can do this,
158
00:07:57,880 --> 00:08:02,020
the systems we employ to manage our sales,
159
00:08:02,020 --> 00:08:04,875
so these operational systems, they're not optimized.
160
00:08:04,875 --> 00:08:07,970
They're write-intensive systems and yet this query
161
00:08:07,970 --> 00:08:10,750
would best be delivered through a read-intensive system.
162
00:08:10,750 --> 00:08:13,700
While we can, it has negative impacts on performance for
163
00:08:13,700 --> 00:08:15,500
both the requesting user.
164
00:08:15,500 --> 00:08:18,250
But perhaps also for the operations of those inserting
165
00:08:18,250 --> 00:08:19,310
orders into the system.
166
00:08:20,660 --> 00:08:23,170
Next, we would consider do these systems contain the data we need
167
00:08:23,170 --> 00:08:24,090
by design?
168
00:08:24,090 --> 00:08:25,630
If someone comes to me looking for
169
00:08:25,630 --> 00:08:28,540
a report that correlates temperature to sales,
170
00:08:28,540 --> 00:08:30,990
that's a great question, and it's a valid question.
171
00:08:30,990 --> 00:08:33,920
But if we don't collect data around temperature then we're
172
00:08:33,920 --> 00:08:36,140
not in a position to answer that question.
173
00:08:36,140 --> 00:08:38,060
The other consideration is history.
174
00:08:38,060 --> 00:08:39,290
Operational systems for
175
00:08:39,290 --> 00:08:43,330
their own optimization reasons like to archive regularly.
176
00:08:43,330 --> 00:08:44,680
The smaller the sets of the data,
177
00:08:44,680 --> 00:08:47,180
the more efficiently they can do their job.
178
00:08:47,180 --> 00:08:50,090
However, business intelligence loves history.
179
00:08:50,090 --> 00:08:52,910
We love to see trends across time.
180
00:08:52,910 --> 00:08:56,500
So, do these systems contain adequate volumes of data?
181
00:08:56,500 --> 00:08:59,440
Next we can consider historical context and I love to use
182
00:08:59,440 --> 00:09:03,350
the example of Jenny Jones, an employee at my company.
183
00:09:03,350 --> 00:09:05,640
And Jenny, well she gets married and
184
00:09:05,640 --> 00:09:07,720
she chooses to change her last name.
185
00:09:07,720 --> 00:09:11,030
So, an update in the HR system overwrites her last name from
186
00:09:11,030 --> 00:09:13,980
Jones to Smith and then I run historical reports to look
187
00:09:13,980 --> 00:09:15,760
at her sales activities of last year.
188
00:09:16,980 --> 00:09:19,760
Perhaps this isn't a problem when we see that Jenny Smith,
189
00:09:19,760 --> 00:09:22,900
with the new name had sales activities last year because we
190
00:09:22,900 --> 00:09:24,350
all know Jenny.
191
00:09:24,350 --> 00:09:28,370
But let me provide you a twist that Jenny Smith relocates
192
00:09:28,370 --> 00:09:32,470
between sales territories from Australia to New Zealand.
193
00:09:32,470 --> 00:09:35,744
And by updating a simple flag against the employee, we have
194
00:09:35,744 --> 00:09:38,966
shifted all of the historical sales to a new sales region.
195
00:09:38,966 --> 00:09:41,716
And clearly this is undesirable from a reporting and
196
00:09:41,716 --> 00:09:43,700
analytics perspective.
197
00:09:43,700 --> 00:09:47,040
Operational systems rarely give consideration to the historical
198
00:09:47,040 --> 00:09:48,122
context of data.
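One way to keep the historical context in the Jenny example is to store a new version of the employee row for each change instead of overwriting it. The sketch below assumes that approach; the table layout, dates and amounts are hypothetical and are not taken from any particular HR or sales system.

```python
from datetime import date

# Hypothetical versioned employee rows: each change adds a row with a validity range.
employee_history = [
    {"employee_id": 1, "name": "Jenny Jones", "territory": "Australia",
     "valid_from": date(2020, 1, 1), "valid_to": date(2023, 6, 30)},
    {"employee_id": 1, "name": "Jenny Smith", "territory": "New Zealand",
     "valid_from": date(2023, 7, 1), "valid_to": date(9999, 12, 31)},
]

# Hypothetical sales facts: (employee_id, sale_date, amount).
sales = [
    (1, date(2022, 3, 15), 500.0),
    (1, date(2023, 9, 1), 200.0),
]


def territory_on(employee_id, on_date):
    """Return the territory the employee belonged to on the given date."""
    for row in employee_history:
        if (row["employee_id"] == employee_id
                and row["valid_from"] <= on_date <= row["valid_to"]):
            return row["territory"]
    raise LookupError(f"no employee version for {employee_id} on {on_date}")


def sales_by_territory(sales_rows):
    """Total sales per territory, using the territory valid at the time of sale."""
    totals = {}
    for employee_id, sale_date, amount in sales_rows:
        territory = territory_on(employee_id, sale_date)
        totals[territory] = totals.get(territory, 0.0) + amount
    return totals


totals = sales_by_territory(sales)
print(totals)
```

With versioned rows, last year's sales stay in Australia even after the move to New Zealand, whereas a single overwritten flag would shift them all.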
199
00:09:48,122 --> 00:09:51,187
Lastly, are these systems available or accessible?
200
00:09:51,187 --> 00:09:53,220
So, numerous challenges.
201
00:09:53,220 --> 00:09:56,510
And then to move on from data challenges to human challenges,
202
00:09:56,510 --> 00:10:00,020
our business users ordinarily do not come from an IT department.
203
00:10:00,020 --> 00:10:04,027
So they typically don't have sufficient skills, tools,
204
00:10:04,027 --> 00:10:07,401
or even the permissions to access these systems.
205
00:10:07,401 --> 00:10:09,473
Self-service business intelligence will be
206
00:10:09,473 --> 00:10:11,660
a topic that comes up from time to time.
207
00:10:11,660 --> 00:10:15,040
And we will address that some users are empowered to produce
208
00:10:15,040 --> 00:10:19,510
their own solutions, but others are reliant upon pre-delivered
209
00:10:19,510 --> 00:10:23,190
reports or exploration experiences through data models.
210
00:10:23,190 --> 00:10:26,430
Lastly and in reference to my example of the profitability and
211
00:10:26,430 --> 00:10:28,750
the conflicts we had in a meeting,
212
00:10:28,750 --> 00:10:31,430
systems may not have consistent definitions.
213
00:10:31,430 --> 00:10:35,339
So querying across systems produces ambiguities and
214
00:10:35,339 --> 00:10:36,591
inaccuracies.
215
00:10:36,591 --> 00:10:40,180
Now, our decision makers then, have common requirements.
216
00:10:40,180 --> 00:10:42,071
They need to be able to discover and find data.
217
00:10:42,071 --> 00:10:45,110
It needs to be reliable and secure.
218
00:10:45,110 --> 00:10:46,614
They need flexibility.
219
00:10:46,614 --> 00:10:49,436
And the way I like to describe this is that in organisations
220
00:10:49,436 --> 00:10:51,640
today, it's typically a pyramid.
221
00:10:51,640 --> 00:10:54,692
You can consider at the very apex you have your executives in
222
00:10:54,692 --> 00:10:55,675
C level.
223
00:10:55,675 --> 00:11:00,192
Now commonly, these individuals are driven by dashboards.
224
00:11:00,192 --> 00:11:02,760
They wanna see numbers, colors, arrows,
225
00:11:02,760 --> 00:11:05,470
trends, and where things are off track,
226
00:11:05,470 --> 00:11:09,660
they would like to drill in and understand causes.
227
00:11:09,660 --> 00:11:12,030
Now when we think of the same organization chart,
228
00:11:12,030 --> 00:11:15,340
those at the lower levels are typically process workers.
229
00:11:15,340 --> 00:11:18,360
These also are people that have business requirements to
230
00:11:18,360 --> 00:11:21,760
ask questions and use data to deliver their answers.
231
00:11:21,760 --> 00:11:24,390
But what you'll find is that process workers typically have
232
00:11:24,390 --> 00:11:26,970
repetitive and recurring questions and
233
00:11:26,970 --> 00:11:29,180
as such fixed reports work very well for them.
234
00:11:29,180 --> 00:11:32,590
Now what interests me is the level in between,
235
00:11:32,590 --> 00:11:35,010
which are more like your analysts and power users, and
236
00:11:35,010 --> 00:11:39,630
these people often work on ad hoc custom requirements.
237
00:11:39,630 --> 00:11:41,560
And they might work with tools like Excel and
238
00:11:41,560 --> 00:11:44,200
produce rather complex solutions.
239
00:11:44,200 --> 00:11:46,940
And so what I'm demonstrating here is that different users all
240
00:11:46,940 --> 00:11:50,620
having valid questions driven by data have the need for
241
00:11:50,620 --> 00:11:52,590
flexibility in the way they access or
242
00:11:52,590 --> 00:11:55,690
even create their own solutions.
243
00:11:55,690 --> 00:11:57,300
Low latency has already been brought up.
244
00:11:57,300 --> 00:12:00,490
Increasingly we want real time data and certainly decision
245
00:12:00,490 --> 00:12:02,590
makers and business users need tools and training.
246
00:12:03,670 --> 00:12:05,800
Where I'd like to finish up in this topic, then,
247
00:12:05,800 --> 00:12:08,890
is to consider delivery scenarios.
248
00:12:08,890 --> 00:12:12,060
Let's begin then with Operational Reporting.
249
00:12:12,060 --> 00:12:14,280
I've already mentioned this,
250
00:12:14,280 --> 00:12:17,550
that typically operational systems will have a library
251
00:12:17,550 --> 00:12:20,280
of reports that drive the day-to-day operations.
252
00:12:20,280 --> 00:12:20,990
For example,
253
00:12:20,990 --> 00:12:24,730
in a sale system, we're likely to have an invoice report.
254
00:12:24,730 --> 00:12:29,321
This is not such a bad thing, however, if these reports
255
00:12:29,321 --> 00:12:33,731
become more complex or more demanding of resources.
256
00:12:33,731 --> 00:12:36,986
For example, the need to aggregate billions of rows of
257
00:12:36,986 --> 00:12:40,030
data to produce that simple matrix of products and
258
00:12:40,030 --> 00:12:41,746
their sales across months.
259
00:12:41,746 --> 00:12:44,707
Then these will impact negatively on the performance of
260
00:12:44,707 --> 00:12:46,875
everybody's experience.
261
00:12:46,875 --> 00:12:50,995
So we might consider moving up a notch and producing
262
00:12:50,995 --> 00:12:54,035
a business intelligence delivery scenario around a particular
263
00:12:54,035 --> 00:12:57,975
business process, maybe the budgeting process in finance.
264
00:12:57,975 --> 00:12:59,685
Let's produce experiences,
265
00:12:59,685 --> 00:13:04,099
reports and analytic solutions to drive that business process.
266
00:13:05,240 --> 00:13:08,060
Moving up another notch, we have the Data Mart.
267
00:13:08,060 --> 00:13:12,080
And by definition, this is the integration of potentially
268
00:13:12,080 --> 00:13:16,330
multiple stores to provide a subject-specific area for
269
00:13:16,330 --> 00:13:18,890
the support of analysis and reporting.
270
00:13:18,890 --> 00:13:22,441
It could integrate, for example, a finance GL and a planning
271
00:13:22,441 --> 00:13:25,791
system and, therefore, with an integrated set of data and
272
00:13:25,791 --> 00:13:28,350
single version of the truth business logic.
273
00:13:28,350 --> 00:13:31,230
The finance people have a place to go
274
00:13:31,230 --> 00:13:33,930
to answer their questions from the data mart.
275
00:13:35,776 --> 00:13:38,360
Now we arrive then, at the enterprise data warehouse,
276
00:13:38,360 --> 00:13:40,980
which in fact is the focus of this training course and more
277
00:13:40,980 --> 00:13:44,830
specifically, the implementation of relational data warehousing.
278
00:13:44,830 --> 00:13:46,810
If you can imagine, over time,
279
00:13:46,810 --> 00:13:49,400
the accumulation of these data marts,
280
00:13:49,400 --> 00:13:54,470
these subject-specific stores around HR, sales, finance.
281
00:13:54,470 --> 00:13:57,590
And designed in such a way that they're integrated, and
282
00:13:57,590 --> 00:14:01,630
they're conformed to work with consistent definitions like
283
00:14:01,630 --> 00:14:04,170
time, product, employee.
284
00:14:04,170 --> 00:14:07,300
What you're building then is the enterprise data warehouse to
285
00:14:07,300 --> 00:14:08,700
support the integration and
286
00:14:08,700 --> 00:14:12,820
access of critical information systems by business users.
287
00:14:12,820 --> 00:14:15,402
And that very much sets the focus of this course, and
288
00:14:15,402 --> 00:14:16,583
concludes this topic.