1
00:00:01,350 --> 00:00:08,370
Hello and a very warm welcome to the data scientist course. To many people, data science and machine
2
00:00:08,370 --> 00:00:15,920
learning feel like unknown territory, something too abstract and complex. To show you that isn't the case,
3
00:00:15,990 --> 00:00:21,660
I would like to start by giving you a hands-on example of what to expect from this course and what you
4
00:00:21,660 --> 00:00:23,210
will be able to do on your own
5
00:00:23,280 --> 00:00:29,710
only a few hours into our training. There are three key words that will come up a lot during the lecture:
6
00:00:29,920 --> 00:00:34,010
data, algorithm, and insight. Here
7
00:00:34,190 --> 00:00:41,150
we have data collected from the customers of a shop, 30 observations in total. Each observation represents
8
00:00:41,150 --> 00:00:45,730
a client who shared their customer satisfaction and brand loyalty.
9
00:00:45,740 --> 00:00:53,330
Let's suppose the owners of the shop hired our consultancy firm to analyze customer behavior. Dividing
10
00:00:53,330 --> 00:00:59,270
the shop's customer base into groups of individuals with similar traits is a great way to reduce complexity
11
00:00:59,390 --> 00:01:03,690
and come up with ideas on how to serve these customer groups better.
12
00:01:03,770 --> 00:01:09,860
And, of course, win their business in the long run. To do that, we will have to apply
13
00:01:09,890 --> 00:01:13,010
machine learning. Ready?
14
00:01:13,060 --> 00:01:13,980
Here we go.
15
00:01:14,170 --> 00:01:18,850
The dataset that we've got is already loaded in the variable data.
16
00:01:18,850 --> 00:01:24,440
A good preliminary step of most analyses is to visualize the data and examine it.
17
00:01:24,700 --> 00:01:31,250
One of the better tools to do that is a scatterplot. How many groups of points can you see here?
18
00:01:32,080 --> 00:01:35,610
There are two groups standing out. In data science,
19
00:01:35,680 --> 00:01:38,120
we would normally call these groups clusters.
20
00:01:38,260 --> 00:01:45,130
So two clusters can be identified instantly, with no machine learning whatsoever. One represents people
21
00:01:45,130 --> 00:01:50,550
with low loyalty and low satisfaction and the other one containing all the rest.
22
00:01:51,730 --> 00:01:57,170
Our preliminary visual examination shows us that there are some insights we can draw for sure.
23
00:01:57,430 --> 00:02:00,610
But let's take a more scientific approach.
24
00:02:01,550 --> 00:02:09,410
Most of the time in data science, you would want to standardize your data. Next, we will perform some
25
00:02:09,500 --> 00:02:16,880
unsupervised machine learning, more specifically cluster analysis. Using the popular K-means algorithm,
26
00:02:17,090 --> 00:02:19,720
we will identify four clusters.
27
00:02:19,730 --> 00:02:24,470
The code, which we'll examine in detail later on in the course, looks like the following.
28
00:02:25,580 --> 00:02:27,690
And we are done.
29
00:02:27,810 --> 00:02:34,850
I can now plot the data using the predicted clusters as colors of the new scatterplot. We've got the
30
00:02:34,850 --> 00:02:38,690
same scatterplot but with four clusters.
31
00:02:38,690 --> 00:02:40,880
Our customers have been segmented.
32
00:02:40,880 --> 00:02:47,900
From here, we can distinguish four types of customers and actually name them. The ones with the low satisfaction
33
00:02:47,900 --> 00:02:56,060
and low loyalty will be called alienated. Those with high satisfaction and high loyalty are fans.
34
00:02:56,060 --> 00:02:59,920
Those with low satisfaction and high loyalty are supporters.
35
00:03:00,140 --> 00:03:04,460
And the last ones, that are neutral or disloyal but have a high satisfaction:
36
00:03:04,580 --> 00:03:09,040
these are roamers. Using just a few lines of code,
37
00:03:09,040 --> 00:03:11,470
we've reached a remarkable result.
38
00:03:11,470 --> 00:03:14,770
We have segmented our customers in four different groups.
39
00:03:14,800 --> 00:03:19,590
We've applied an algorithm on our data to reach an insight.
40
00:03:19,840 --> 00:03:23,590
Naturally, we must analyze what we see. Data
41
00:03:23,590 --> 00:03:27,820
science is about storytelling and making sense of numbers.
42
00:03:27,820 --> 00:03:30,820
We have four groups, but only one of them is favorable:
43
00:03:30,820 --> 00:03:37,450
the fans. Cluster analysis indicates the problem: some customers are dissatisfied,
44
00:03:37,540 --> 00:03:39,210
others are disloyal.
45
00:03:39,220 --> 00:03:44,660
However, we must figure out how to solve the problem ourselves.
46
00:03:44,670 --> 00:03:48,740
What are some ideas a data scientist and management will come up with?
47
00:03:48,780 --> 00:03:55,130
It makes sense to focus our efforts to turn supporters into fans by improving their shopping experience.
48
00:03:55,140 --> 00:04:01,510
Normally, we would have to dig deeper to find the drivers of dissatisfaction for these customers.
49
00:04:01,530 --> 00:04:07,230
Maybe it is long queues or unfriendly staff or perhaps high prices.
50
00:04:07,380 --> 00:04:13,630
Whatever the reason, we must take actionable steps to fix the issue and make our supporters happier.
51
00:04:14,880 --> 00:04:17,980
Simultaneously, we can do something else.
52
00:04:18,180 --> 00:04:25,760
We can turn the roamers into fans by increasing their brand loyalty. Loyalty cards, gifts, personalized
53
00:04:25,760 --> 00:04:31,790
discounts, vouchers, and raffles are different strategies used to make such clients loyal in the long
54
00:04:31,790 --> 00:04:34,590
run. Great.
55
00:04:34,600 --> 00:04:39,360
Please bear in mind that in this exercise we missed a few steps along the way:
56
00:04:39,430 --> 00:04:45,700
typing code step by step, creating a dendrogram, analyzing a heat map, and finding the optimal number
57
00:04:45,700 --> 00:04:47,100
of clusters.
58
00:04:47,140 --> 00:04:51,530
However, these are all topics we will address later on in the course.
59
00:04:52,000 --> 00:04:58,060
So let's begin acquiring the knowledge needed step by step until we are ready to gain insights from
60
00:04:58,060 --> 00:05:05,410
larger data sets with various algorithms so we can turn all types of data into actionable insights.
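
The clustering step narrated in this walkthrough (standardize the data, then run K-means with four clusters) can be sketched in plain Python. This is a minimal illustration, not the course's code: the transcript does not show the actual code or the data set, so the customer values, the helper names standardize, sq_dist and kmeans, and the random seed are all assumptions made for this sketch.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sigmas)) for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Basic K-means (Lloyd's algorithm); returns (labels, centroids)."""
    rng = random.Random(seed)
    # Start from k distinct observations chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids


# Hypothetical (satisfaction, loyalty) pairs, invented for this sketch.
# They are NOT the 30 observations used in the lecture.
data = [
    (2, -1.2), (3, -0.9),   # low satisfaction, low loyalty
    (9, 1.4), (8, 1.1),     # high satisfaction, high loyalty
    (3, 1.0), (2, 1.3),     # low satisfaction, high loyalty
    (9, -0.8), (8, -1.1),   # high satisfaction, low loyalty
]

scaled = standardize(data)
labels, centroids = kmeans(scaled, k=4)

for (satisfaction, loyalty), label in zip(data, labels):
    print(f"satisfaction={satisfaction}, loyalty={loyalty} -> cluster {label}")
```

The cluster numbers themselves carry no meaning. Names such as alienated, fans, supporters and roamers are assigned afterwards by looking at where each cluster sits on the satisfaction and loyalty axes. Because the starting centroids are random, a different seed can produce a different grouping.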