All language subtitles for 012 A Practical Example of Bayesian Inference_en

1 00:00:03,060 --> 00:00:05,880 Instructor: Welcome to yet another practical example. 2 00:00:05,880 --> 00:00:07,080 To make sure you understand 3 00:00:07,080 --> 00:00:09,750 how all of these new terms are applied in practice, 4 00:00:09,750 --> 00:00:12,390 we are going to go over some real-world data. 5 00:00:12,390 --> 00:00:15,510 In particular, we will examine student population statistics 6 00:00:15,510 --> 00:00:20,070 for Hamilton College during the 2017-18 academic year. 7 00:00:20,070 --> 00:00:23,040 We will apply Bayes' Law to determine whether the college 8 00:00:23,040 --> 00:00:26,400 is successfully diversifying its student population. 9 00:00:26,400 --> 00:00:28,950 Since the college only provides a four year program, 10 00:00:28,950 --> 00:00:31,350 any representation higher than 25% 11 00:00:31,350 --> 00:00:33,500 would indicate a higher than average value. 12 00:00:34,800 --> 00:00:37,140 Okay, to conduct our analysis 13 00:00:37,140 --> 00:00:39,090 we are going to examine the Common Data Set 14 00:00:39,090 --> 00:00:43,440 for Hamilton College for the 2017-18 academic year. 15 00:00:43,440 --> 00:00:45,929 The Common Data Set, or CDS for short, 16 00:00:45,929 --> 00:00:47,850 is a free public data set 17 00:00:47,850 --> 00:00:50,340 of summarized statistics about the demographic 18 00:00:50,340 --> 00:00:52,440 of the students attending a given college. 19 00:00:53,790 --> 00:00:54,960 A new data set is released 20 00:00:54,960 --> 00:00:57,930 after the start of every new academic year. 21 00:00:57,930 --> 00:01:00,450 Furthermore, these sets are available for public use 22 00:01:00,450 --> 00:01:03,000 and can be found either through the College Board website 23 00:01:03,000 --> 00:01:06,390 or the site of the specific college you're interested in.
24 00:01:06,390 --> 00:01:07,260 In this instance, 25 00:01:07,260 --> 00:01:09,810 we visited Hamilton College's official webpage 26 00:01:09,810 --> 00:01:12,573 and typed in CDS into the search bar. 27 00:01:13,830 --> 00:01:15,090 Out of the results of our search 28 00:01:15,090 --> 00:01:18,663 we chose the one for the 2017-18 academic year. 29 00:01:20,460 --> 00:01:21,293 Great. 30 00:01:22,170 --> 00:01:23,790 Now that you understand the data, 31 00:01:23,790 --> 00:01:25,050 we can start off by opening 32 00:01:25,050 --> 00:01:30,050 the CDS 2017-2018 PDF file attached to this lecture. 33 00:01:32,460 --> 00:01:36,240 As you can see, section A is titled General Information 34 00:01:36,240 --> 00:01:38,640 and includes basic information about the college. 35 00:01:38,640 --> 00:01:40,050 Like contact details 36 00:01:40,050 --> 00:01:42,930 and whether it is co-educational or not. 37 00:01:42,930 --> 00:01:45,240 This information is important in general 38 00:01:45,240 --> 00:01:47,190 but it is not relevant to our goal, 39 00:01:47,190 --> 00:01:50,853 so we skip it and jump to section B on the third page. 40 00:01:52,350 --> 00:01:55,500 This section is titled Enrollment and Persistence 41 00:01:55,500 --> 00:01:57,540 and it showcases enrollment in the college 42 00:01:57,540 --> 00:01:59,970 divided into different categories. 43 00:01:59,970 --> 00:02:01,970 Now, that's something we'll make use of. 44 00:02:02,940 --> 00:02:05,430 Let's start off by examining Table B1 45 00:02:05,430 --> 00:02:07,620 which showcases the distribution of students 46 00:02:07,620 --> 00:02:09,300 based on their gender. 47 00:02:09,300 --> 00:02:11,310 Since the survey students fill out to apply 48 00:02:11,310 --> 00:02:13,320 only has the options male and female, 49 00:02:13,320 --> 00:02:15,993 the data is split solely into these two groups. 
50 00:02:17,130 --> 00:02:18,510 Now, let's spend some time ensuring 51 00:02:18,510 --> 00:02:20,100 you interpret the table correctly 52 00:02:20,100 --> 00:02:21,870 because we will be examining similar ones 53 00:02:21,870 --> 00:02:23,020 throughout the lecture. 54 00:02:23,970 --> 00:02:24,990 Start off by examining 55 00:02:24,990 --> 00:02:27,630 the four different columns in the table. 56 00:02:27,630 --> 00:02:30,060 They represent full-time and part-time students 57 00:02:30,060 --> 00:02:31,890 separated by gender. 58 00:02:31,890 --> 00:02:32,940 More precisely, 59 00:02:32,940 --> 00:02:35,850 the first column showcases full-time male students 60 00:02:35,850 --> 00:02:39,243 and the second one showcases full-time female students. 61 00:02:40,620 --> 00:02:42,120 The third and fourth columns 62 00:02:42,120 --> 00:02:43,530 represent part-time male 63 00:02:43,530 --> 00:02:45,680 and part-time female students, respectively. 64 00:02:47,460 --> 00:02:49,500 Now let us look at the rows. 65 00:02:49,500 --> 00:02:52,800 We see that they're split into two major groups as well, 66 00:02:52,800 --> 00:02:55,053 Undergraduates and Graduates. 67 00:02:56,520 --> 00:02:58,800 Since Hamilton is a liberal arts college, 68 00:02:58,800 --> 00:02:59,760 it is an institution 69 00:02:59,760 --> 00:03:02,040 which only offers bachelor's degrees. 70 00:03:02,040 --> 00:03:04,290 There are no graduate students. 71 00:03:04,290 --> 00:03:06,150 This explains why no numeric values 72 00:03:06,150 --> 00:03:08,850 feature in the lower half of the table. 73 00:03:08,850 --> 00:03:10,680 Furthermore, all claims we make 74 00:03:10,680 --> 00:03:13,514 will be about undergraduate students. 75 00:03:13,514 --> 00:03:16,770 For instance, the number 217 in the table 76 00:03:16,770 --> 00:03:19,710 is part of the first column and the first row.
77 00:03:19,710 --> 00:03:21,330 Therefore, the people included 78 00:03:21,330 --> 00:03:23,460 are simultaneously full-time males 79 00:03:23,460 --> 00:03:26,460 and degree seeking first time freshmen. 80 00:03:26,460 --> 00:03:27,810 In a Bayesian sense, 81 00:03:27,810 --> 00:03:31,050 these 217 students represent the intersection 82 00:03:31,050 --> 00:03:34,533 of the sets denoted in the first row and the first column. 83 00:03:36,510 --> 00:03:38,793 Let's observe the total undergraduates row. 84 00:03:40,500 --> 00:03:42,090 It includes all degree seeking 85 00:03:42,090 --> 00:03:44,590 and all other students enrolled in credit courses. 86 00:03:45,720 --> 00:03:48,330 This means that the value 1,000 we observe 87 00:03:48,330 --> 00:03:50,460 in the sixth row of the second column, 88 00:03:50,460 --> 00:03:53,219 is the union of all full-time undergraduate women 89 00:03:53,219 --> 00:03:54,483 in the college. 90 00:03:55,320 --> 00:03:57,210 Since there are no graduate students, 91 00:03:57,210 --> 00:03:59,190 1,000 is actually the union 92 00:03:59,190 --> 00:04:02,103 of all the full-time women in the entire college. 93 00:04:03,990 --> 00:04:05,700 Now, the union of all women, 94 00:04:05,700 --> 00:04:07,290 both full-time and part-time, 95 00:04:07,290 --> 00:04:11,520 would be 1,000 plus nine or 1,009. 96 00:04:11,520 --> 00:04:12,353 This is true 97 00:04:12,353 --> 00:04:14,310 because no students can be simultaneously 98 00:04:14,310 --> 00:04:16,529 part-time and full-time. 99 00:04:16,529 --> 00:04:18,839 A Bayesian way of expressing this relationship 100 00:04:18,839 --> 00:04:21,180 would be to say that the sets of part-time 101 00:04:21,180 --> 00:04:24,153 and full-time students are mutually exclusive. 102 00:04:25,710 --> 00:04:30,710 Additionally, we have 886 plus two, or 888 male students, 103 00:04:31,620 --> 00:04:33,420 attending the college. 
104 00:04:33,420 --> 00:04:34,253 Below the table, 105 00:04:34,253 --> 00:04:36,810 we are provided with the total number of enrolled students, 106 00:04:36,810 --> 00:04:39,393 which is 1,897. 107 00:04:40,260 --> 00:04:44,460 Since there are 1,009 women and 888 men, 108 00:04:44,460 --> 00:04:46,653 they combine to complete the sample space. 109 00:04:47,820 --> 00:04:49,950 Furthermore, due to the nature of the survey, 110 00:04:49,950 --> 00:04:52,200 nobody was allowed to mark any answer different 111 00:04:52,200 --> 00:04:53,850 from male or female, 112 00:04:53,850 --> 00:04:56,970 so the two sets have no overlapping elements. 113 00:04:56,970 --> 00:04:58,320 Satisfying both conditions 114 00:04:58,320 --> 00:05:00,603 means the two sets are complements. 115 00:05:01,530 --> 00:05:03,660 All right, now that you're well acquainted 116 00:05:03,660 --> 00:05:05,100 with how to read these tables, 117 00:05:05,100 --> 00:05:08,010 we will move on to a different part of section B. 118 00:05:08,010 --> 00:05:11,490 Namely, we will focus on the Racial/Ethnic diversity 119 00:05:11,490 --> 00:05:14,163 in the college summarized in table B2 below. 120 00:05:17,040 --> 00:05:18,300 We wanna use Bayes' Law 121 00:05:18,300 --> 00:05:20,100 to determine whether the college 122 00:05:20,100 --> 00:05:23,220 is successfully diversifying its student population. 123 00:05:23,220 --> 00:05:24,960 Therefore, we need to use the table 124 00:05:24,960 --> 00:05:26,850 to determine whether the freshman class 125 00:05:26,850 --> 00:05:30,483 is more diverse than the average for the specific ethnicity. 126 00:05:31,350 --> 00:05:33,900 To do so, we need to be able to accurately compute 127 00:05:33,900 --> 00:05:36,723 the appropriate size of each set we are interested in.
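The Table B1 reasoning above (mutually exclusive groups add up, and two non-overlapping sets that fill the sample space are complements) can be checked with a minimal sketch. The counts are the ones read aloud in the lecture; the variable names are illustrative only.

```python
# Counts quoted in the lecture for the "total undergraduates" row of Table B1.
full_time_women, part_time_women = 1000, 9
full_time_men, part_time_men = 886, 2
total_enrolled = 1897  # total stated below the table

# Full-time and part-time are mutually exclusive, so each union is a plain sum.
women = full_time_women + part_time_women
men = full_time_men + part_time_men

# Complements: the two sets do not overlap (by survey design)
# and together they cover the whole sample space.
are_complements = (women + men == total_enrolled)

print(women, men, are_complements)  # 1009 888 True
```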
128 00:05:37,830 --> 00:05:38,880 Let event A 129 00:05:38,880 --> 00:05:42,570 be for a Degree-Seeking First-Time First-Year student, 130 00:05:42,570 --> 00:05:47,490 and event B being Black or African American, non-Hispanic. 131 00:05:47,490 --> 00:05:50,160 For convenience, we are going to refer to elements of A 132 00:05:50,160 --> 00:05:53,823 simply as First-Years and elements of B as Black. 133 00:05:55,230 --> 00:05:58,500 Since we have a total of 1,897 students, 134 00:05:58,500 --> 00:06:01,110 and 480 of them are First-Years, 135 00:06:01,110 --> 00:06:03,000 then the probability of being a freshman 136 00:06:03,000 --> 00:06:07,313 equals 480 over 1,897 or 0.253. 137 00:06:09,510 --> 00:06:12,780 This suggests that approximately 25.3% 138 00:06:12,780 --> 00:06:14,613 of the student body are freshmen. 139 00:06:16,170 --> 00:06:18,450 Similarly, we can estimate the probability 140 00:06:18,450 --> 00:06:19,410 of a random student 141 00:06:19,410 --> 00:06:22,920 at the institution being of African American descent. 142 00:06:22,920 --> 00:06:27,920 That would equal 80 over 1,897 or 0.042, or close to 4.2%. 143 00:06:32,373 --> 00:06:33,900 Now, the intersection of A and B 144 00:06:33,900 --> 00:06:37,230 would represent all Black first year students. 145 00:06:37,230 --> 00:06:38,730 Going back to the table, 146 00:06:38,730 --> 00:06:42,870 only 26 students represent both demographics. 147 00:06:42,870 --> 00:06:45,390 The probability of being a Black, First-Year student 148 00:06:45,390 --> 00:06:50,390 is 26 over 1,897 or 0.014, which is close to 1.4%. 
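As a quick check of the three probabilities just computed, here is a small sketch that divides each set size by the size of the sample space. It assumes only the counts quoted in the lecture; the names are hypothetical.

```python
total = 1897            # all enrolled students (the sample space)
first_years = 480       # event A
black = 80              # event B
black_first_years = 26  # intersection of A and B

# Each probability is the size of the set over the size of the sample space.
p_a = first_years / total
p_b = black / total
p_a_and_b = black_first_years / total

print(round(p_a, 3), round(p_b, 3), round(p_a_and_b, 3))  # 0.253 0.042 0.014
```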
149 00:06:56,370 --> 00:06:58,830 We know the likelihood of a student being African American 150 00:06:58,830 --> 00:06:59,880 and we know the chance 151 00:06:59,880 --> 00:07:03,510 of a random student being both Black and a freshman, 152 00:07:03,510 --> 00:07:06,090 thus, we can use the conditional probability 153 00:07:06,090 --> 00:07:08,430 to see that the likelihood of a Black student 154 00:07:08,430 --> 00:07:10,320 being in his first year at the college 155 00:07:10,320 --> 00:07:14,493 is 26 over 80, or 0.325. 156 00:07:15,390 --> 00:07:17,340 This value is significantly greater 157 00:07:17,340 --> 00:07:20,010 than the expected average of 0.25, 158 00:07:20,010 --> 00:07:21,510 so we can see a rising trend 159 00:07:21,510 --> 00:07:24,753 in the representation of minorities in the student population. 160 00:07:26,100 --> 00:07:28,470 The union of A and B represents all students 161 00:07:28,470 --> 00:07:31,980 who are either First-Time, First-Years or Black. 162 00:07:31,980 --> 00:07:34,500 We know that there are 480 First-Years, 163 00:07:34,500 --> 00:07:38,460 80 Black and 26 First-Year Black students. 164 00:07:38,460 --> 00:07:42,060 To find the number of students within the union of A and B, 165 00:07:42,060 --> 00:07:43,683 we would apply the Additive Law. 166 00:07:45,750 --> 00:07:47,280 According to the Additive Law, 167 00:07:47,280 --> 00:07:52,280 we would have 480 plus 80 minus 26, or 534 students, 168 00:07:53,490 --> 00:07:56,130 that are either freshmen or Black. 169 00:07:56,130 --> 00:07:58,050 Once again, we would find the probability 170 00:07:58,050 --> 00:08:00,780 of being part of the union by dividing the size 171 00:08:00,780 --> 00:08:03,543 of the union by the size of the sample space.
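The conditional probability and the Additive Law described above can be sketched as follows, using the 480 First-Years, 80 Black students, and 26 Black First-Years quoted in the lecture. Variable names are illustrative.

```python
total = 1897
first_years = 480       # |A|
black = 80              # |B|
black_first_years = 26  # |A and B|

# P(A | B): of the Black students, the share who are First-Years.
p_a_given_b = black_first_years / black

# Additive Law: |A or B| = |A| + |B| - |A and B|, so the overlap is not counted twice.
union_size = first_years + black - black_first_years
p_union = union_size / total

print(p_a_given_b, union_size, round(p_union, 3))  # 0.325 534 0.281
```

Subtracting the 26 matters because those students appear in both the First-Year count and the Black count.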
172 00:08:04,380 --> 00:08:09,167 In this instance, that would be 534 over 1,897, or 0.281, 173 00:08:12,360 --> 00:08:16,500 which indicates that approximately 28.1% of the student body 174 00:08:16,500 --> 00:08:19,323 is either a freshman or identifies as Black. 175 00:08:20,700 --> 00:08:22,590 So far so good. 176 00:08:22,590 --> 00:08:25,470 Now, suppose C represents the set 177 00:08:25,470 --> 00:08:28,023 of all Hispanic/Latino students at the college. 178 00:08:28,920 --> 00:08:32,190 Since event B clearly says non-Hispanic, 179 00:08:32,190 --> 00:08:35,039 then the two must be mutually exclusive. 180 00:08:35,039 --> 00:08:38,400 Thus, the intersection of the two is the empty set 181 00:08:38,400 --> 00:08:41,370 but their union equals the sum of their elements. 182 00:08:41,370 --> 00:08:43,049 Therefore, according to the table, 183 00:08:43,049 --> 00:08:48,050 there must be 167 plus 80, or 247 students, 184 00:08:48,480 --> 00:08:51,843 who identify as either African American or Latino. 185 00:08:53,100 --> 00:08:55,200 The probability of picking a random student 186 00:08:55,200 --> 00:08:57,360 and them identifying as either one, 187 00:08:57,360 --> 00:09:02,360 equals 247 over 1,897, or 0.13, which equals 13%. 188 00:09:07,110 --> 00:09:08,670 Not that great as a percentage, 189 00:09:08,670 --> 00:09:10,473 but great work on figuring that out. 190 00:09:11,640 --> 00:09:12,960 You could surely find out these 191 00:09:12,960 --> 00:09:14,670 and other relationships on your own. 192 00:09:14,670 --> 00:09:16,500 However, let's dig a bit deeper 193 00:09:16,500 --> 00:09:18,650 and examine some conditional probabilities. 194 00:09:19,680 --> 00:09:22,200 In table B2, the entire first column 195 00:09:22,200 --> 00:09:26,370 only represents values for First-Year students.
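For the mutually exclusive sets B and C discussed above, the Additive Law reduces to a plain sum because the intersection is empty. A minimal sketch with the lecture's counts (names are illustrative):

```python
total = 1897
black = 80       # |B|, Black or African American, non-Hispanic
hispanic = 167   # |C|, Hispanic/Latino
overlap = 0      # B and C are mutually exclusive by definition of the categories

# With an empty intersection, the union is just the sum of the two set sizes.
union_size = black + hispanic - overlap
p_union = union_size / total

print(union_size, round(p_union, 2))  # 247 0.13
```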
196 00:09:26,370 --> 00:09:27,930 Therefore, any number we get 197 00:09:27,930 --> 00:09:29,790 would represent the size of the intersection 198 00:09:29,790 --> 00:09:32,910 of freshmen and another demographic. 199 00:09:32,910 --> 00:09:34,680 This is important when we wish to compute 200 00:09:34,680 --> 00:09:36,123 conditional probabilities. 201 00:09:37,950 --> 00:09:40,110 Recall that the conditional probability formula 202 00:09:40,110 --> 00:09:42,300 states that the likelihood of an event occurring 203 00:09:42,300 --> 00:09:44,490 given another event has already occurred, 204 00:09:44,490 --> 00:09:46,920 equals the likelihood of the intersection 205 00:09:46,920 --> 00:09:49,293 over the likelihood of the second event. 206 00:09:50,160 --> 00:09:53,070 A more precise example would be the following. 207 00:09:53,070 --> 00:09:54,690 The likelihood of being Black, 208 00:09:54,690 --> 00:09:56,340 given you are a freshman, 209 00:09:56,340 --> 00:09:59,520 equals the probability of being a Black freshman 210 00:09:59,520 --> 00:10:01,563 over the likelihood of being a freshman. 211 00:10:02,550 --> 00:10:05,880 We can simplify this to the size of the intersection 212 00:10:05,880 --> 00:10:07,743 over the size of the second set. 213 00:10:08,850 --> 00:10:13,850 In our example, that would mean 26 over 480, or 0.054. 214 00:10:14,460 --> 00:10:17,730 Therefore, there is a roughly 5.4% chance 215 00:10:17,730 --> 00:10:21,000 for any freshman student to identify as Black. 216 00:10:21,000 --> 00:10:22,830 Similarly, we can compute the likelihood 217 00:10:22,830 --> 00:10:25,173 of a given student to be Hispanic, First-Year. 218 00:10:26,280 --> 00:10:28,590 We can compute the likelihood of being a Latino, 219 00:10:28,590 --> 00:10:30,510 given you are in your first year of college 220 00:10:30,510 --> 00:10:32,700 as well as the likelihood of being a freshman 221 00:10:32,700 --> 00:10:34,533 and apply the multiplication rule. 
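The simplification mentioned above, from a ratio of probabilities to a ratio of set sizes, can be verified exactly with Python's `fractions` module. This is only a sketch using the counts from the lecture.

```python
from fractions import Fraction

total = 1897
first_years = 480       # |A|
black_first_years = 26  # |A and B|

# Conditional probability formula: P(B | A) = P(A and B) / P(A).
p_a = Fraction(first_years, total)
p_a_and_b = Fraction(black_first_years, total)
p_b_given_a = p_a_and_b / p_a

# The sample-space size cancels, leaving intersection size over |A|.
same_as_counts = (p_b_given_a == Fraction(black_first_years, first_years))

print(same_as_counts, round(float(p_b_given_a), 3))  # True 0.054
```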
222 00:10:35,850 --> 00:10:36,683 Let's begin. 223 00:10:38,580 --> 00:10:40,978 We start by examining events A and C 224 00:10:40,978 --> 00:10:44,430 being a First-Year and being Latino. 225 00:10:44,430 --> 00:10:45,840 The likelihood of being Latino, 226 00:10:45,840 --> 00:10:47,550 given you are a First-Year, 227 00:10:47,550 --> 00:10:50,671 equals the number of Latino students who are First-Years 228 00:10:50,671 --> 00:10:52,473 over all First-Years. 229 00:10:53,340 --> 00:10:54,480 According to the table, 230 00:10:54,480 --> 00:10:59,480 that equals 57 over 480, or 0.119, which is close to 12%. 231 00:11:02,460 --> 00:11:04,110 So far so good. 232 00:11:04,110 --> 00:11:06,150 Now that we've computed both probabilities, 233 00:11:06,150 --> 00:11:09,300 we can plug the values into the multiplication rule. 234 00:11:09,300 --> 00:11:13,200 The probability of being a freshman is 0.253 235 00:11:13,200 --> 00:11:14,940 and the probability of being Latino, 236 00:11:14,940 --> 00:11:18,900 assuming you are a First-Year, is 0.119. 237 00:11:18,900 --> 00:11:22,290 By multiplying the two, we get 0.03, 238 00:11:22,290 --> 00:11:25,803 or a 3% likelihood of being a Latino First-Year. 239 00:11:27,014 --> 00:11:28,860 Great job. 240 00:11:28,860 --> 00:11:30,450 What if we wanna find out the likelihood 241 00:11:30,450 --> 00:11:34,200 of being a freshman, given you are Hispanic? 242 00:11:34,200 --> 00:11:37,320 We could calculate this using two different ways. 243 00:11:37,320 --> 00:11:38,280 In the first one, 244 00:11:38,280 --> 00:11:41,010 we would simply apply the conditional probability formula 245 00:11:41,010 --> 00:11:42,330 like we did earlier. 246 00:11:42,330 --> 00:11:45,843 However, we could also apply Bayes' Law to solve this. 
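The multiplication rule used above, P(A and C) = P(C | A) times P(A), can be sketched with the lecture's counts. The result should match counting Latino First-Years directly, 57 out of 1,897. Names are illustrative.

```python
total = 1897
first_years = 480          # |A|
latino_first_years = 57    # |A and C|

p_a = first_years / total                        # about 0.253
p_c_given_a = latino_first_years / first_years   # about 0.119

# Multiplication rule: P(A and C) = P(C | A) * P(A).
p_a_and_c = p_c_given_a * p_a

# Direct count for comparison.
p_direct = latino_first_years / total

print(round(p_c_given_a, 3), round(p_a_and_c, 3), round(p_direct, 3))  # 0.119 0.03 0.03
```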
247 00:11:46,710 --> 00:11:47,970 According to the theorem, 248 00:11:47,970 --> 00:11:49,680 the likelihood of being a freshman, 249 00:11:49,680 --> 00:11:51,570 assuming you are Hispanic, 250 00:11:51,570 --> 00:11:53,790 equals the likelihood of being Latino, 251 00:11:53,790 --> 00:11:55,320 given you are a First-Year, 252 00:11:55,320 --> 00:11:57,810 times the probability of being a freshman 253 00:11:57,810 --> 00:12:00,930 over the probability of being Hispanic. 254 00:12:00,930 --> 00:12:03,840 Next, we estimate the likelihood of being Latino, 255 00:12:03,840 --> 00:12:08,840 which is 167 over 1,897, or 0.089, and that is close to 9%. 256 00:12:13,230 --> 00:12:16,350 We have estimated all three of the required probabilities 257 00:12:16,350 --> 00:12:20,333 and they are respectively equal to 0.119, 0.253, and 0.089. 258 00:12:24,600 --> 00:12:26,100 Plugging these values into the formula 259 00:12:26,100 --> 00:12:31,100 gives us 0.119 times 0.253 divided by 0.089, or 0.338. 260 00:12:37,530 --> 00:12:40,260 That means there is roughly a 34% chance 261 00:12:40,260 --> 00:12:44,400 a student is a First-Year assuming they are Hispanic. 262 00:12:44,400 --> 00:12:46,620 Thus, we can say that a person is more likely 263 00:12:46,620 --> 00:12:49,020 to be a First-Year, given they are Hispanic, 264 00:12:49,020 --> 00:12:52,560 than to be Hispanic, given they are a freshman. 265 00:12:52,560 --> 00:12:55,050 If we think about the favorable-over-all formula, 266 00:12:55,050 --> 00:12:57,660 this makes sense because there are more freshmen 267 00:12:57,660 --> 00:13:00,030 than Hispanic students in the college. 268 00:13:00,030 --> 00:13:02,130 Such a characteristic is fairly common 269 00:13:02,130 --> 00:13:04,920 among small liberal arts colleges in upstate New York, 270 00:13:04,920 --> 00:13:06,963 so the insight does not surprise us. 271 00:13:09,390 --> 00:13:10,473 Phenomenal work.
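The Bayes' Law step above can be reproduced in a short sketch. It evaluates the formula twice: once with the three rounded probabilities quoted in the lecture, and once with the raw counts (57 Latino First-Years, 480 First-Years, 167 Latino students, 1,897 students). The small gap between the two results comes only from rounding the inputs; the function name `bayes` is hypothetical.

```python
def bayes(p_c_given_a: float, p_a: float, p_c: float) -> float:
    """Bayes' Law: P(A | C) = P(C | A) * P(A) / P(C)."""
    return p_c_given_a * p_a / p_c

# Rounded probabilities as quoted in the lecture.
rounded = bayes(0.119, 0.253, 0.089)

# The same formula fed with unrounded ratios of the table counts.
total, first_years, latino, latino_first_years = 1897, 480, 167, 57
exact = bayes(latino_first_years / first_years,
              first_years / total,
              latino / total)

print(round(rounded, 3), round(exact, 3))  # 0.338 0.341
```

With exact inputs the formula collapses to 57 over 167, the same value the plain conditional probability formula gives.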
272 00:13:11,310 --> 00:13:13,740 We examined several tables from the Common Data Set 273 00:13:13,740 --> 00:13:17,280 for Hamilton College for the 2017-18 academic year 274 00:13:17,280 --> 00:13:19,080 and our short analysis suggests 275 00:13:19,080 --> 00:13:22,080 that the college is improving its minority representation 276 00:13:22,080 --> 00:13:24,030 with the current freshman class. 277 00:13:24,030 --> 00:13:26,340 However, further research would be required 278 00:13:26,340 --> 00:13:28,820 to account for attrition among the student population 279 00:13:28,820 --> 00:13:32,340 as well as transfers to other colleges within the region. 280 00:13:32,340 --> 00:13:33,330 Even though our analysis 281 00:13:33,330 --> 00:13:35,250 may not have been full or conclusive, 282 00:13:35,250 --> 00:13:38,670 we made full use of our understanding of Bayesian notation 283 00:13:38,670 --> 00:13:41,190 to reach some insight about the data. 284 00:13:41,190 --> 00:13:43,590 This shows how important Bayesian inference is 285 00:13:43,590 --> 00:13:44,880 in terms of analytics 286 00:13:44,880 --> 00:13:46,680 and how understanding the relationship 287 00:13:46,680 --> 00:13:48,120 between sets and events 288 00:13:48,120 --> 00:13:50,760 can help us reach important conclusions. 289 00:13:50,760 --> 00:13:54,060 For homework, you can explore section C of the CDS 290 00:13:54,060 --> 00:13:57,690 which summarizes First-Time, First-Year admissions. 291 00:13:57,690 --> 00:14:00,420 Compute the likelihood of a First-Time male student 292 00:14:00,420 --> 00:14:02,580 being accepted 293 00:14:02,580 --> 00:14:04,680 and determine whether being male or female 294 00:14:04,680 --> 00:14:07,113 had any effect on your chances of acceptance. 295 00:14:08,040 --> 00:14:10,950 Furthermore, determine whether First-Time freshman women 296 00:14:10,950 --> 00:14:12,270 were more likely to enroll 297 00:14:12,270 --> 00:14:13,863 than First-Time freshman men.
298 00:14:14,700 --> 00:14:16,110 To find both of these, 299 00:14:16,110 --> 00:14:19,713 you need to examine only the tables in part C1 of the CDS. 300 00:14:21,390 --> 00:14:23,550 Additionally, you can practice your understanding 301 00:14:23,550 --> 00:14:25,380 of probabilities by exploring the values 302 00:14:25,380 --> 00:14:27,990 in table C2 and determining the likelihood 303 00:14:27,990 --> 00:14:30,720 of being offered a place on the wait list. 304 00:14:30,720 --> 00:14:33,660 Additionally, you can compute the chance of being admitted 305 00:14:33,660 --> 00:14:35,460 having accepted a place on the wait list 306 00:14:35,460 --> 00:14:37,350 and the likelihood of getting admitted 307 00:14:37,350 --> 00:14:39,753 given you are offered a place on the wait list. 308 00:14:40,800 --> 00:14:41,790 In the next section, 309 00:14:41,790 --> 00:14:43,020 we are going to start talking about 310 00:14:43,020 --> 00:14:44,700 probability distributions, 311 00:14:44,700 --> 00:14:45,990 how to properly apply them 312 00:14:45,990 --> 00:14:48,480 and why understanding the most commonly featured ones 313 00:14:48,480 --> 00:14:50,250 is so important. 314 00:14:50,250 --> 00:14:51,363 Thanks for watching.