All language subtitles for 6. Factor and Categorical Matrices

af Afrikaans
sq Albanian
am Amharic
ar Arabic
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bn Bengali
bs Bosnian
bg Bulgarian
ca Catalan
ceb Cebuano
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
tl Filipino
fi Finnish
fr French
fy Frisian
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian Download
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
km Khmer
ko Korean
ku Kurdish (Kurmanji)
ky Kyrgyz
lo Lao
la Latin
lv Latvian
lt Lithuanian
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mn Mongolian
my Myanmar (Burmese)
ne Nepali
no Norwegian
ps Pashto
fa Persian
pl Polish
pt Portuguese
pa Punjabi
ro Romanian
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
st Sesotho
sn Shona
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
te Telugu
th Thai
tr Turkish
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
or Odia (Oriya)
rw Kinyarwanda
tk Turkmen
tt Tatar
ug Uyghur
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 00:00:00,420 --> 00:00:06,230 Hello everyone and welcome to the lecture on factor in categorical matrices this lecture we're going 2 00:00:06,230 --> 00:00:09,850 to look at how to create factor in categorical matrices in our. 3 00:00:10,050 --> 00:00:13,140 Let's go ahead and jump to our studio to get started. 4 00:00:13,140 --> 00:00:15,300 All right so here here our studio. 5 00:00:15,480 --> 00:00:20,520 And in this lecture we're mainly going to be discussing the factor function and it's used for creating 6 00:00:20,520 --> 00:00:22,210 categorical matrices. 7 00:00:22,470 --> 00:00:27,510 This specific function will become really useful when we begin to apply data analysis and machine learning 8 00:00:27,510 --> 00:00:29,210 techniques to our data. 9 00:00:29,220 --> 00:00:32,160 This is also sometimes and then is creating dummy variables. 10 00:00:32,160 --> 00:00:35,630 So you may be familiar with it in that context. 11 00:00:35,670 --> 00:00:40,230 So let's go to start by show an example of why and how we can build a sort of matrix. 12 00:00:40,390 --> 00:00:50,580 I want to go ahead and create a vector called Animal Nalut just go ahead and pretend that where have 13 00:00:50,580 --> 00:00:52,120 some sort of animal sanctuary. 14 00:00:52,130 --> 00:00:54,040 And there's dogs and cats in it. 15 00:00:54,090 --> 00:00:59,090 So we'll go ahead and just label this the C for cat D for dog. 16 00:00:59,340 --> 00:01:07,180 Let's go ahead and put another dog in there and then we'll start with two more cats. 17 00:01:07,180 --> 00:01:07,870 All right. 18 00:01:08,160 --> 00:01:11,690 So we have our animal vector where C is for cat D is for dog. 19 00:01:11,700 --> 00:01:16,890 And so we're trying to represent some sort of animal sanctuary and let's say each of these specific 20 00:01:16,890 --> 00:01:20,320 animals also has a corresponding ID number. 21 00:01:20,430 --> 00:01:23,440 So we'll go ahead and make an ID number vector. 22 00:01:23,970 --> 00:01:29,130 Know let's just go ahead and say One two three four five. 23 00:01:29,130 --> 00:01:34,440 All right so we have our animals in Adie's and we want to convert the animal vector into information 24 00:01:34,440 --> 00:01:40,260 that an algorithm or equation can understand more easily meaning we want to begin to check how many 25 00:01:40,260 --> 00:01:45,030 categories or factor levels are actually in our character vector. 26 00:01:45,330 --> 00:01:51,600 So we can do is we can pass the vector through the factor function like this. 27 00:01:51,600 --> 00:01:54,230 So we're going to go in and say factor. 28 00:01:55,080 --> 00:02:00,470 And so then notice I'm going to go ahead and pass an animal. 29 00:02:00,810 --> 00:02:07,980 And so what you get back is this vector that also notice that it has these levels attached to it C and 30 00:02:07,980 --> 00:02:08,790 D. 31 00:02:08,850 --> 00:02:16,470 So I'm going to go ahead and assign this factor of animal and I'll assign it to let's just go ahead 32 00:02:16,470 --> 00:02:22,180 and say fact Annie as some variable name for this. 33 00:02:22,320 --> 00:02:24,580 So we're going to save that factor. 34 00:02:24,690 --> 00:02:25,250 All right. 35 00:02:25,410 --> 00:02:29,340 So we can see that we have two levels D-N.C. for dog and cat. 36 00:02:29,610 --> 00:02:34,310 So in our There's going to be two distinct types of categorical variables. 37 00:02:34,380 --> 00:02:38,670 There's going to be ordinal categorical variables in nominal categorical variables. 38 00:02:38,790 --> 00:02:42,780 So the nominal categorical variables let me go in and type this out. 39 00:02:42,780 --> 00:02:44,540 You can also reference the notes for this. 40 00:02:44,770 --> 00:02:50,310 So right now we're talking about nominal must go ahead and make that a little brighter by not putting 41 00:02:50,310 --> 00:02:51,660 it as a comment. 42 00:02:51,660 --> 00:02:55,230 So the nominal categorical variables don't have any order. 43 00:02:55,350 --> 00:02:57,180 So you can imagine that dogs and cats. 44 00:02:57,180 --> 00:02:59,480 There's no order to them. 45 00:02:59,520 --> 00:03:06,570 Well some people may prefer dogs for cats but in reality there's no order to them versus ordinal categorical 46 00:03:06,570 --> 00:03:08,370 variables as the name implies. 47 00:03:08,400 --> 00:03:18,420 Do have some sort of order so we can think of nominal there's no order versus let's go ahead and say 48 00:03:20,160 --> 00:03:24,660 ordinal there is an order. 49 00:03:24,750 --> 00:03:27,690 So nominal an example that is what we have here. 50 00:03:27,840 --> 00:03:34,770 This animal so there's no order versus dogs and cats but something that could be ordinal of order could 51 00:03:34,770 --> 00:03:44,970 be something like temperatures so we could have something like or Dohm cat with let's say cold 52 00:03:48,300 --> 00:03:51,890 medium and hot. 53 00:03:51,960 --> 00:03:59,700 So this is a ordinal categorical variable because even though we're dealing with strings here and different 54 00:03:59,700 --> 00:04:01,080 categories. 55 00:04:01,080 --> 00:04:05,910 So cold is a category medium is a category of temperature hot as category temperature. 56 00:04:05,910 --> 00:04:07,620 You can see that there's an order to them. 57 00:04:07,650 --> 00:04:11,280 It should be cold medium hot versus dogs and cats. 58 00:04:11,310 --> 00:04:14,690 So Dog is a category C is a category for cat. 59 00:04:14,850 --> 00:04:19,470 There's no order of dogs before cats are cast before dogs. 60 00:04:19,470 --> 00:04:20,150 OK. 61 00:04:20,640 --> 00:04:27,120 So if you actually want to assign an order to these variables you would be able to do that while using 62 00:04:27,120 --> 00:04:28,250 the factor function. 63 00:04:28,290 --> 00:04:34,680 And you can actually pass in an argument with ordered equals true and pass in the levels argument as 64 00:04:34,680 --> 00:04:34,990 well. 65 00:04:34,990 --> 00:04:37,510 So we're going to show you how we can do that. 66 00:04:37,560 --> 00:04:47,610 So let's say we have a vector temp's and it's going to be called Medium a high bigger than just copy 67 00:04:47,610 --> 00:04:49,170 and paste this. 68 00:04:49,290 --> 00:04:53,780 And we're also going to put in a couple of more instances so I'll say there's another instance of hot 69 00:04:53,830 --> 00:04:54,320 . 70 00:04:54,690 --> 00:04:58,320 Another instance of Ha we'll say there's another axis of cold. 71 00:04:58,590 --> 00:05:03,440 And finally one more instance of median. 72 00:05:03,820 --> 00:05:04,050 Right. 73 00:05:04,050 --> 00:05:06,820 So we have our attempts vector here. 74 00:05:06,840 --> 00:05:10,520 So it's caught cold medium hot ha hot cold medium. 75 00:05:10,540 --> 00:05:10,740 Right. 76 00:05:10,740 --> 00:05:12,670 So we have these three categories. 77 00:05:12,690 --> 00:05:23,340 Let's go ahead and pass this into a variable using factor and let me just bring this back up by clearing 78 00:05:23,340 --> 00:05:29,110 that I'm going to go ahead and say temp's is that vector of cold medium hot et cetera. 79 00:05:29,130 --> 00:05:39,780 Various instances and I'll say the factored version of this called The Factor function pass in my temps 80 00:05:39,860 --> 00:05:40,850 vector. 81 00:05:41,280 --> 00:05:45,750 I'm going to say ordered is equal to true. 82 00:05:45,780 --> 00:05:50,310 Remember I didn't do this for the dogs and cats vector because there was no order that made sense for 83 00:05:50,310 --> 00:05:52,150 dogs and cats in this case. 84 00:05:52,170 --> 00:05:53,730 There is an order that makes sense. 85 00:05:53,730 --> 00:05:55,830 It should be cold medium that hot. 86 00:05:55,950 --> 00:05:59,610 You could also do hot medium cold depending on which way you want it to think about it. 87 00:05:59,640 --> 00:06:07,050 This case will say cold comes first then medium than hot so I'll go ahead and add in the level's argument 88 00:06:07,560 --> 00:06:14,130 and the levels argument is going to take in a vector and you just pasan a vector of the order you want 89 00:06:15,030 --> 00:06:20,040 needs to be ordered it's going to say cold medium and then hot 90 00:06:23,000 --> 00:06:23,320 right. 91 00:06:23,370 --> 00:06:26,160 So we've assigned that factored version of that. 92 00:06:26,370 --> 00:06:32,420 Dr. chanels go ahead and see an idea of what this looks like by calling fact that temp. 93 00:06:32,460 --> 00:06:38,010 So notice the levels actually have an order to them and you can see it by the comparison operator in 94 00:06:38,010 --> 00:06:42,570 between the terms so cold is less than medium and medium is less than hot. 95 00:06:42,960 --> 00:06:48,750 OK so that's pretty useful and it's going to be really useful when we end up using the summary function 96 00:06:49,230 --> 00:06:53,540 to call a summary on that object. 97 00:06:54,660 --> 00:07:00,720 So I'll go ahead and tell us cold medium and hot and the various counts of how many times those appear 98 00:07:01,290 --> 00:07:08,550 and we can show you a summary of the original temps as well to show the difference between the factored 99 00:07:09,030 --> 00:07:13,920 version of that vector versus the summary of the original. 100 00:07:13,920 --> 00:07:14,610 All right. 101 00:07:14,610 --> 00:07:16,860 So that's it for this lecture. 102 00:07:16,890 --> 00:07:22,320 Don't stress out too much about understanding everything you've gone over here during this lecture. 103 00:07:22,350 --> 00:07:26,160 We're not going to touch back on this until quite a while from now. 104 00:07:26,310 --> 00:07:31,900 I just wanted you to be aware of it since it fits in well with the ideas of vectors and matrices. 105 00:07:31,920 --> 00:07:39,210 So the only thing you should be aware of are the use of the factor function as a possibility for you 106 00:07:39,210 --> 00:07:39,930 to use. 107 00:07:40,080 --> 00:07:47,890 And then the difference between a nominal category set versus an ordinal category set. 108 00:07:47,910 --> 00:07:53,520 OK so the nominal you have those cats and dogs are the order ordinal as the name implies there is an 109 00:07:53,550 --> 00:07:54,480 order to it. 110 00:07:54,480 --> 00:08:00,750 And you can specify that order using the factor function by saying order equals true and then specifying 111 00:08:00,780 --> 00:08:08,230 the levels as a vector and you just pasand the vector of the terms in indexed order that you want. 112 00:08:08,510 --> 00:08:10,920 OK that's it for this lecture. 113 00:08:10,920 --> 00:08:11,960 Thanks everyone. 114 00:08:12,000 --> 00:08:13,550 I'll see at the next lecture. 11673

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.