Subtitles for 001 A Practical Example What You Will Learn in This Course (English)

Hello, and a very warm welcome to the data scientist course. To many people, data science and machine learning feel like unknown territory, something too abstract and complex. To show you that isn't the case, I would like to start by giving you a hands-on example of what to expect from this course and what you will be able to do on your own only a few hours into our training.
There are three key words that will come up a lot during the lecture: data, algorithm, and insight.
Here we have data collected from the customers of a shop, 30 observations in total. Each observation represents a client who shared their customer satisfaction and brand loyalty. Let's suppose the owners of the shop hired our consultancy firm to analyze customer behavior.
Dividing the shop's customer base into groups of individuals with similar traits is a great way to reduce complexity and come up with ideas on how to serve these customer groups better and, of course, win their business in the long run.
To do that, we will have to apply machine learning. Ready? Here we go.
The data set that we've got is already loaded in the variable data. A good preliminary step of most analyses is to visualize the data and examine it. One of the better tools to do that is a scatterplot.
How many groups of points can you see here? There are two groups standing out. In data science, we would normally call these groups clusters, so two clusters can be identified instantly with no machine learning whatsoever. One represents people with low loyalty and low satisfaction, and the other contains all the rest.
Our preliminary visual examination shows us that there are some insights we can draw for sure, but let's take a more scientific approach.
Most of the time in data science, you would want to standardize your data.
Next, we will perform some unsupervised machine learning, more specifically cluster analysis. Using the popular K-means algorithm, we will identify four clusters. The code, which we will examine in detail later on in the course, looks like the following.
And we are done. I can now plot the data using the predicted clusters as colors of the new scatterplot.
We've got the same scatterplot, but with four clusters. Our customers have been segmented. From here, we can distinguish four types of customers and actually name them. The ones with low satisfaction and low loyalty will be called alienated. Those with high satisfaction and high loyalty are fans. Those with low satisfaction and high loyalty are supporters. And the last ones, who are neutral or disloyal but have high satisfaction, are roamers.
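The on-screen code is not part of these subtitles, so what follows is only a rough sketch of the steps just described (standardize, run K-means with four clusters, name the segments), not the course's own code. It uses nothing but the Python standard library. The twelve data points are made-up illustrative values (the real data set has 30 observations that are not reproduced here), and the function names `standardize`, `initial_centroids`, `k_means`, and `name_segment` are hypothetical. The farthest-point starting centroids are a simplification chosen to keep the run deterministic; a real project would more likely rely on a library such as scikit-learn.

```python
import math
from statistics import mean, pstdev

# Made-up (satisfaction, loyalty) pairs for illustration only;
# they are NOT the shop data used in the lecture.
data = [
    (2, -1.5), (3, -1.2), (2, -0.9),   # low satisfaction, low loyalty
    (9, 1.4), (10, 1.1), (9, 1.6),     # high satisfaction, high loyalty
    (2, 1.2), (3, 1.5), (1, 1.0),      # low satisfaction, high loyalty
    (9, -1.3), (10, -0.8), (8, -1.1),  # high satisfaction, low loyalty
]

def standardize(points):
    """Rescale every column to zero mean and unit standard deviation."""
    columns = list(zip(*points))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]  # assumes no column is constant
    return [tuple((v - m) / s for v, m, s in zip(p, mus, sigmas)) for p in points]

def initial_centroids(points, k):
    """Deterministic start: first point, then the point farthest from all chosen ones."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids

def k_means(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = initial_centroids(points, k)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(mean(axis) for axis in zip(*members))
    return labels, centroids

def name_segment(centroid):
    """Map a standardized (satisfaction, loyalty) centroid to the lecture's names."""
    satisfaction, loyalty = centroid
    if satisfaction < 0:
        return "supporters" if loyalty >= 0 else "alienated"
    return "fans" if loyalty >= 0 else "roamers"

scaled = standardize(data)
labels, centroids = k_means(scaled, k=4)
segments = [name_segment(centroids[lab]) for lab in labels]

for point, segment in zip(data, segments):
    print(point, segment)
```

Standardizing first means satisfaction and loyalty contribute on the same scale when distances are computed, which is why the lecture applies it before clustering.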
Using just a few lines of code, we've reached a remarkable result: we have segmented our customers into four different groups. We've applied an algorithm on our data to reach an insight.
Naturally, we must analyze what we see. Data science is about storytelling and making sense of numbers. We have four groups, but only one of them is favorable: the fans. Cluster analysis indicates the problem. Some customers are dissatisfied, others are disloyal. However, we must figure out how to solve the problem ourselves.
What are some ideas a data scientist and management will come up with? It makes sense to focus our efforts on turning supporters into fans by improving their shopping experience. Normally, we would have to dig deeper to find the drivers of dissatisfaction for these customers. Maybe it is long queues or unfriendly staff or perhaps high prices. Whatever the reason, we must take actionable steps to fix the issue and make our supporters happier.
Simultaneously, we can do something else. We can turn the roamers into fans by increasing their brand loyalty. Loyalty cards, gifts, personalized discounts, vouchers and raffles are different strategies used to make such clients loyal in the long run. Great.
Please bear in mind that in this exercise, we missed a few steps along the way: typing code step by step, creating a program, analyzing a heat map and finding the optimal number of clusters. However, these are all topics we will address later on in the course.
So let's begin acquiring the knowledge needed step by step until we are ready to gain insights from larger data sets with various algorithms, so we can turn all types of data into actionable insights.
