1. A Practical Example: What You Will Learn in This Course

Hello, and a very warm welcome to the data scientist course. To many people, data science and machine learning feel like unknown territory, something too abstract and complex. To show you that isn't the case, I would like to start by giving you a hands-on example of what to expect from this course and what you will be able to do on your own only a few hours into our training.

There are three key words that will come up a lot during the lecture: data, algorithm, and insight.

Here we have data collected from the customers of a shop, 30 observations in total. Each observation represents a client who shared their customer satisfaction and brand loyalty. Let's suppose the owners of the shop hired our consultancy firm to analyze customer behavior. Dividing the shop's customer base into groups of individuals with similar traits is a great way to reduce complexity and come up with ideas on how to serve these customer groups better and, of course, win their business in the long run. To do that, we will have to apply machine learning.

Ready? Here we go.

The dataset that we've got is already loaded in the variable data. A good preliminary step of most analyses is to visualize the data and examine it. One of the better tools to do that is a scatterplot. How many groups of points can you see here?
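
A minimal sketch of this first look at the data, in Python. The 30 points below are synthetic and generated only for illustration; they are not the shop's data, so the plot will not reproduce the groups discussed in the lecture. The value ranges and the helper names `make_customers` and `plot_customers` are assumptions. Plotting uses matplotlib's `scatter`, imported inside the function so that the data part runs without it.

```python
import random

def make_customers(n=30, seed=42):
    """Return n synthetic (satisfaction, loyalty) pairs, for illustration only."""
    rng = random.Random(seed)
    # Assumed ranges: satisfaction from 1 to 10, loyalty as a score in [-2, 2].
    return [(rng.randint(1, 10), round(rng.uniform(-2.0, 2.0), 2)) for _ in range(n)]

def plot_customers(data):
    """Draw satisfaction against loyalty as a scatterplot."""
    import matplotlib.pyplot as plt  # third-party; only needed for plotting
    satisfaction = [s for s, _ in data]
    loyalty = [l for _, l in data]
    plt.scatter(satisfaction, loyalty)
    plt.xlabel("Satisfaction")
    plt.ylabel("Loyalty")
    plt.show()

# Stands in for the preloaded `data` variable mentioned in the lecture.
data = make_customers()
```

Calling `plot_customers(data)` opens the figure.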
There are two groups standing out. In data science, we would normally call these groups clusters. So two clusters can be identified instantly, with no machine learning whatsoever: one represents people with low loyalty and low satisfaction, and the other contains all the rest.

Our preliminary visual examination shows us that there are some insights we can draw for sure, but let's take a more scientific approach. Most of the time in data science, you would want to standardize your data. Next, we will perform some unsupervised machine learning, more specifically cluster analysis. Using the popular K-means algorithm, we will identify four clusters. The code, which we'll examine in detail later on in the course, looks like the following.

And we are done. I can now plot the data using the predicted clusters as colors of the new scatterplot. We've got the same scatterplot, but with four clusters. Our customers have been segmented.

From here, we can distinguish four types of customers and actually name them. The ones with low satisfaction and low loyalty will be called alienated. Those with high satisfaction and high loyalty are fans. Those with low satisfaction and high loyalty are supporters. And the last ones, who are neutral or disloyal but have high satisfaction, are roamers. All of this took just a few lines of code.
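
The on-screen code is not part of the transcript, so what follows is only a sketch of the two steps described above: standardizing the data and running K-means with four clusters. It is written in plain Python to stay self-contained; in practice a library such as scikit-learn (its `KMeans` class) would normally be used. The data is synthetic, and the names `standardize`, `sq_dist`, and `kmeans` are hypothetical helpers, not the course's code.

```python
import random
from statistics import mean, pstdev

def standardize(points):
    """Rescale each feature to zero mean and unit standard deviation."""
    columns = list(zip(*points))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(p, mus, sigmas)) for p in points]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means (Lloyd's algorithm); returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen observations
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(mean(c) for c in zip(*members))
    return labels, centroids

# Synthetic (satisfaction, loyalty) pairs around four made-up centers.
rng = random.Random(1)
centers = [(2, -1.5), (2, 1.5), (9, -1.5), (9, 1.5)]
data = [(centers[i % 4][0] + rng.uniform(-1, 1),
         centers[i % 4][1] + rng.uniform(-0.3, 0.3)) for i in range(30)]

scaled = standardize(data)
labels, centroids = kmeans(scaled, k=4)
```

Passing `labels` as the `c` argument of matplotlib's `scatter` colors each point by its cluster. Because K-means starts from random observations, a different `seed` can yield a different segmentation.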
We've reached a remarkable result. We have segmented our customers into four different groups. We've applied an algorithm to our data to reach an insight.

Naturally, we must analyze what we see. Data science is about storytelling and making sense of numbers. We have four groups, but only one of them is favorable: the fans. Cluster analysis indicates the problem: some customers are dissatisfied, others are disloyal. However, we must figure out how to solve the problem ourselves.

What are some ideas a data scientist and management will come up with? It makes sense to focus our efforts on turning supporters into fans by improving their shopping experience. Normally, we would have to dig deeper to find the drivers of dissatisfaction for these customers. Maybe it is long queues, or unfriendly staff, or perhaps high prices. Whatever the reason, we must take actionable steps to fix the issue and make our supporters happier.

Simultaneously, we can do something else. We can turn the roamers into fans by increasing their brand loyalty. Loyalty cards, gifts, personalized discounts, vouchers, and raffles are different strategies used to make such clients loyal in the long run.

Great. Please bear in mind that in this exercise we missed a few steps along the way.
Typing code step by step, creating a dendrogram, analyzing a heat map, and finding the optimal number of clusters. However, these are all topics we will address later on in the course.

So let's begin acquiring the knowledge needed, step by step, until we are ready to gain insights from larger datasets with various algorithms, so we can turn all types of data into actionable insights.
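
One of the steps mentioned above, finding the optimal number of clusters, is often approached with the elbow method: run K-means for several values of k and look for the point where the within-cluster sum of squares (WCSS) stops dropping sharply. The transcript does not say which method the course uses, so the sketch below is an assumption, written in plain Python on synthetic data; `kmeans_wcss` and `sq_dist` are hypothetical helper names.

```python
import random
from statistics import mean

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wcss(points, k, iterations=100, seed=0):
    """Run plain K-means and return the within-cluster sum of squares."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        nearest = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                   for p in points]
        # Recompute each centroid as the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, nearest) if lab == j]
            updated.append(tuple(mean(c) for c in zip(*members))
                           if members else centroids[j])
        if updated == centroids:
            break  # centroids stopped moving
        centroids = updated
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

# Synthetic (satisfaction, loyalty) pairs around four made-up centers.
rng = random.Random(1)
centers = [(2, -1.5), (2, 1.5), (9, -1.5), (9, 1.5)]
data = [(centers[i % 4][0] + rng.uniform(-1, 1),
         centers[i % 4][1] + rng.uniform(-0.3, 0.3)) for i in range(30)]

# WCSS for k = 1..6; the "elbow" is where the curve flattens out.
wcss = [kmeans_wcss(data, k) for k in range(1, 7)]
for k, value in enumerate(wcss, start=1):
    print(k, round(value, 2))
```

Plotting `wcss` against k and reading off the bend is a judgment call, not a formal test, which is why it is usually combined with domain knowledge about how many segments are useful.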
