subtitlecat.com

1. A Practical Example: What You Will Learn in This Course

[00:00:01] Hello, and a very warm welcome to the data scientist course. To many people, data science and machine learning feel like unknown territory, something too abstract and complex. To show you that isn't the case, I would like to start by giving you a hands-on example of what to expect from this course and what you will be able to do on your own only a few hours into our training. There are three key words that will come up a lot during the lecture: data, algorithm, and insight.

[00:00:34] Here we have data collected from the customers of a shop, 30 observations in total. Each observation represents a client who shared their customer satisfaction and brand loyalty.

[00:00:45] Let's suppose the owners of the shop hired our consultancy firm to analyze customer behavior. Dividing the shop's customer base into groups of individuals with similar traits is a great way to reduce complexity and come up with ideas on how to serve these customer groups better and, of course, win their business in the long run. To do that, we will have to apply machine learning. Ready? Here we go.

[00:01:14] The dataset that we've got is already loaded in the variable data. A good preliminary step of most analyses is to visualize the data and examine it. One of the better tools to do that is a scatterplot. How many groups of points can you see here?
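The visual check described above can be sketched in Python. This is only an illustration under stated assumptions: the transcript does not include the shop's actual values, so the 30 observations below are randomly generated (and will not reproduce the groups visible in the lecture's plot). The field names satisfaction and loyalty, their value ranges, and the helper plot_customers are all hypothetical. Plotting assumes the third-party matplotlib library is installed.

```python
import random

# Hypothetical stand-in for the shop's 30 observations. The real values are
# not part of the transcript, so these are random and purely illustrative.
rng = random.Random(42)
data = [
    {
        "satisfaction": rng.randint(1, 10),            # assumed 1-10 scale
        "loyalty": round(rng.uniform(-2.0, 2.0), 2),   # assumed centered score
    }
    for _ in range(30)
]


def plot_customers(rows):
    """Draw satisfaction against loyalty as a scatterplot and return the figure."""
    # matplotlib is third-party; it is imported here so that the data
    # preparation above still runs on a machine without it.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(
        [row["satisfaction"] for row in rows],
        [row["loyalty"] for row in rows],
    )
    ax.set_xlabel("Satisfaction")
    ax.set_ylabel("Loyalty")
    return fig
```

Calling plot_customers(data) returns a figure that can be saved, for example with fig.savefig("customers.png"), and inspected for groups of points before any algorithm is applied.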
[00:01:32] There are two groups standing out. In data science, we would normally call these groups clusters. So two clusters can be identified instantly, with no machine learning whatsoever: one represents people with low loyalty and low satisfaction, and the other one contains all the rest.

[00:01:51] Our preliminary visual examination shows us that there are some insights we can draw for sure. But let's take a more scientific approach.

[00:02:01] Most of the time in data science, you would want to standardize your data. Next, we will perform some unsupervised machine learning, more specifically cluster analysis. Using the popular K-means algorithm, we will identify four clusters. The code, which we'll examine in detail later on in the course, looks like the following. And we are done.

[00:02:27] I can now plot the data using the predicted clusters as colors of the new scatterplot. We've got the same scatterplot, but with four clusters. Our customers have been segmented.

[00:02:40] From here, we can distinguish four types of customers and actually name them. The ones with low satisfaction and low loyalty will be called alienated. Those with high satisfaction and high loyalty are fans. Those with low satisfaction and high loyalty are supporters. And the last ones, which are neutral or disloyal but have high satisfaction, are roamers. All of this took just a few lines of code.
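The steps just described (standardize, cluster into four groups with K-means, name the groups) can be sketched as follows. This is not the code shown in the lecture, which the transcript does not include; it is a dependency-free illustration under several assumptions. The twelve customers are invented, the helper names standardize, kmeans, and segment_name are hypothetical, the starting centroids are chosen with a deterministic farthest-point rule so the result is reproducible, and clusters are named by the sign of their standardized centroid, which is an assumed cut-off. In practice a library such as scikit-learn would normally do the clustering.

```python
import math
import statistics

# Hypothetical (satisfaction, loyalty) pairs, invented for illustration.
# The shop's real 30 observations are not part of the transcript.
customers = [
    (2, -1.4), (3, -1.1), (1, -1.6),   # low satisfaction, low loyalty
    (9, 1.3), (10, 1.5), (8, 1.1),     # high satisfaction, high loyalty
    (3, 1.0), (2, 1.2), (4, 0.9),      # low satisfaction, high loyalty
    (9, -0.9), (8, -1.2), (10, -0.6),  # high satisfaction, low loyalty
]


def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds))
        for row in rows
    ]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point start."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next start: the point farthest from its nearest chosen centroid.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = tuple(statistics.fmean(d) for d in zip(*members))
    return labels, centroids


def segment_name(centroid):
    """Name a cluster by the signs of its standardized centroid (assumed rule)."""
    satisfaction, loyalty = centroid
    if satisfaction < 0:
        return "alienated" if loyalty < 0 else "supporters"
    return "roamers" if loyalty < 0 else "fans"


scaled = standardize(customers)
labels, centroids = kmeans(scaled, k=4)
names = [segment_name(centroids[lab]) for lab in labels]
```

On this toy data each invented group ends up in its own cluster, so names lines up with the four segments from the lecture. The labels list can also be used to color the points of a scatterplot, one color per cluster.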
[00:03:09] We've reached a remarkable result. We have segmented our customers into four different groups. We've applied an algorithm to our data to reach an insight.

[00:03:19] Naturally, we must analyze what we see. Data science is about storytelling and making sense of numbers. We have four groups, but only one of them is favorable: the fans. Cluster analysis indicates the problem: some customers are dissatisfied, others are disloyal. However, we must figure out how to solve the problem ourselves.

[00:03:44] What are some ideas a data scientist and management will come up with? It makes sense to focus our efforts on turning supporters into fans by improving their shopping experience. Normally, we would have to dig deeper to find the drivers of dissatisfaction for these customers. Maybe it is long queues or unfriendly staff, or perhaps high prices. Whatever the reason, we must take actionable steps to fix the issue and make our supporters happier.

[00:04:14] Simultaneously, we can do something else. We can turn the roamers into fans by increasing their brand loyalty. Loyalty cards, gifts, personalized discounts, vouchers, and raffles are different strategies used to make such clients loyal in the long run. Great.

[00:04:34] Please bear in mind that in this exercise we skipped a few steps along the way.
[00:04:39] These include typing the code step by step, creating a dendrogram, analyzing a heat map, and finding the optimal number of clusters. However, these are all topics we will address later on in the course.

[00:04:52] So let's begin acquiring the knowledge needed, step by step, until we are ready to gain insights from larger datasets with various algorithms, so we can turn all types of data into actionable insights.