All language subtitles for 05_choropleth-maps.en

Choropleth maps are one of the most popular and commonly used map types out there, so let's have a look at how they work. The word choropleth was coined by a cartographer named John Kirtland Wright in 1938. He was trying to come up with a word to describe assigning values to different parts of a map, or different spaces. So he went to the Greek origins of this, which are choros for space and pleth for value, and combined those to create this new word, choropleth. So that was a word that this guy invented. Why am I telling you this? Why have I not mentioned the origin of words in a lot of different sections before in other videos? Well, because there's a common thing that people really want to do, which is to put an 'l' in there and call it a chloropleth map. It's just one of those little things that kind of bugs me. It's not a chloropleth map; it's not related to chlorophyll or chloroform or chloro this or chloro that. It's choropleth.
I always think it's important, especially when you're starting out and learning about these things, to get the terminology correct from the beginning. That way, you look like you know what you're talking about and you're not making these kind of weird mistakes that any trained GIS or cartography person will notice right away, like mispronouncing it as chloropleth. I think I've made my point. I just wanted to make sure that was clear so you won't make that mistake. You'll probably think of this and go, "Yeah, I'll never say that now," which is mission accomplished. Okay. So, the whole idea of a choropleth map is that you have numbers that are assigned to areas. They could be anything, but probably the best place to start, or one of the most common ways of using them, is for things that we would call enumeration areas. Think of it like this: if you've counted the number of people for a census unit, a neighborhood, a ward, a congressional district, or whatever it happens to be, you've got a number that you've assigned to a particular area. So here we've got some population counts for different census tracts, and so we have one number per area.
Now, on its own, if you're just looking at that, and I quickly asked you: what's the highest value? What's the lowest? Is there a pattern going on here? Is there a gradation from low to high, from east to west? Something like that. If you're just looking at the numbers, it's really not that easy to see what's going on. So what we do, and this is normally how it would be done, is group those values together into classes and then assign each of those classes either a gradation from black to white, like a gray scale, or a gradation of some kind of color, like I've done here. So now we have five different classes of values, and if we assign those to the numbers in our dataset, giving each an intensity of color or shade that's proportional to those values, then we end up with a choropleth map, where it's much easier to visualize those actual numbers. I could put labels on there as well if I wanted to, but the idea is generally that you want someone to be able to look at that and very easily see which areas have higher population values and which ones have lower values.
Now, it's not super great to just use population counts; it's better to use something like density, but I'll get to that a little bit later. So, let's apply exactly the same idea to my entire data set here. These are all the census tracts for Toronto, and here are all the population counts for each of the census tracts. It's exactly the same thing if I said, "So, what's going on here? Where are the high areas and low areas? Is there some kind of pattern going on? Can we see similarities and differences and relationships, things like that?" Of course, it's really difficult to do that. But we can classify the data. In this case we can do this in ArcMap with the Symbology tab here, and I've specified the value as being population. I'm going to use five classes. So I've divided it up into five; it doesn't have to be five. It's not some magic number; it could be three or seven. I'm using a classification method here called quantile. We'll explain that a little bit more later. And I'm using a color ramp here from this light, what would you call it, kind of a magenta, to darker, and that color ramp is being used to assign a range of colors to these classes.
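The classification step described here can be sketched in a few lines of Python. This is only an illustration of the general idea (pick an attribute, a number of classes, a method, and a color ramp), not how ArcMap implements it. The tract names, population values, helper functions, and hex colors are all made up for the example, and the quantile rule used is one simple variant among several.

```python
def quantile_breaks(values, n_classes=5):
    """Upper bound of each class, chosen so every class holds about the
    same number of areas (a simple quantile rule, assumed for illustration)."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for k in range(1, n_classes + 1):
        # Index of the last value that falls in class k
        idx = max(0, round(k * n / n_classes) - 1)
        breaks.append(ordered[idx])
    return breaks


def classify(value, breaks):
    """Return the index of the first class whose upper bound covers the value."""
    for i, upper in enumerate(breaks):
        if value <= upper:
            return i
    return len(breaks) - 1


# Hypothetical population counts, one number per area
population = {"A": 1200, "B": 3400, "C": 2100, "D": 5600, "E": 4300,
              "F": 800, "G": 6900, "H": 2900, "I": 3800, "J": 5100}

# Illustrative light-to-dark magenta ramp, one color per class
ramp = ["#feebe2", "#fbb4b9", "#f768a1", "#c51b8a", "#7a0177"]

breaks = quantile_breaks(population.values(), n_classes=5)
colors = {name: ramp[classify(value, breaks)] for name, value in population.items()}
```

With ten areas and five classes, each class ends up with two areas, and the darkest color goes to the two largest counts.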
So that's how this idea of a choropleth map is actually implemented in the software, and that's exactly how you would set it up if you were going in and doing that yourself. You've got a data set, you tell it which attribute to use, in other words which column in your table, then you tell it how many classes you want, you tell it how to divide up the numbers into those classes (there are different ways to do that), and then you tell it what color scheme to use. Boom, it puts it all together and you end up with a choropleth map like this. So now, just like I did before, I've got my different classes, and I've got gradational values that indicate low to high. When someone looks at this, it's very easy for them to look at any part of the map and see which areas have lower population versus higher population. Now, I have a question for you: is this a useful map? I want you to think about it for a second. This is a common issue with choropleth maps, and it's something that I think anybody who is teaching this wants to make sure comes across well: look at the size of the areas that are being mapped, and at what's being mapped here.
So, we're mapping people that live in a city, and we're using different sized areas to count up how many people are in those areas. I'm hoping you're seeing that if you have a big area, the odds are pretty good that there are going to be more people in that area, and so there'll be a higher value. And if you have a smaller area, it's likely that there are fewer people at that location, and so you're going to have a lower value. In other words, it's not the most useful way of portraying what's going on here, because it's biased by area. We want to be able to control for that area, or take it out, or normalize it, or somehow account for it, so that when we're making a map, we can show something that's more true to what's really going on. In this case, a better way of doing this is normalizing for area, and the way we do that is to divide the populations by area to create a population density. If I go back to my symbology, I have the option of using this thing here called normalization, and you can select what field you want to use for that.
So, what's happening here is I've got population, let me just try that again. I'm going to go population, and I'm going to normalize that by area. All that means is that I'm asking the software to take my population column divided by my area column, and that's going to calculate on the fly, so to speak, what the population density values are and use those in my choropleth map. So I'm still using the same color scheme, I'm still using five classes, I'm still using quantiles. But now I'm going to be representing population density. I'm taking out that bias that's being introduced by area, and let's see what happens in terms of my result. So, here's what we get. We have a map that has a very different look to it than the population map. This is something that to me makes a lot more sense, or is more useful. This is downtown Toronto here, so yes, the population density is much higher. Look over here: we have a much larger census tract, probably one of the biggest ones in the city, but it turns out that there are really not that many people living there if you account for area.
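Normalizing by a field amounts to dividing one column by another, row by row. A minimal sketch, assuming a table held as a list of dictionaries; the `normalize` helper, the field names, and the three tracts are hypothetical and chosen only to show how the ordering of areas can flip once area is divided out.

```python
def normalize(rows, value_field, normalization_field):
    """Divide value_field by normalization_field for every row.
    Rows with a zero or missing denominator yield None instead of failing."""
    result = []
    for row in rows:
        denominator = row.get(normalization_field)
        result.append(row[value_field] / denominator if denominator else None)
    return result


# Hypothetical census tracts (values invented for illustration)
tracts = [
    {"tract": "T1", "population": 6000, "area_km2": 12.0},
    {"tract": "T2", "population": 4500, "area_km2": 1.5},
    {"tract": "T3", "population": 3000, "area_km2": 0.5},
]

# People per square kilometer, computed on the fly from the two columns
density = normalize(tracts, "population", "area_km2")

# Rank the tracts from highest to lowest, first by count and then by density
rank_by_count = [t["tract"] for t in sorted(tracts, key=lambda t: -t["population"])]
rank_by_density = [t["tract"] for _, t in sorted(zip(density, tracts), key=lambda p: -p[0])]
```

In this made-up table the largest tract has the most people but the lowest density, so the two rankings are exact opposites.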
So it actually has a fairly low population density, and that's probably more useful in a lot of situations when you're trying to interpret things that might be related to government policy, or what politicians might want to use in terms of making decisions. Often, the density will be more useful than the count. Not always, but it's definitely something that you want to take into consideration when you're making a choropleth map. Here's a comparison between the two: total population versus population density. Just two different ways of thinking about a variable that you're trying to map using a choropleth, and what's the most representative or useful way to do that. This example might help you understand how area can bias results, or bias a choropleth map. I really like this example. It's from a book by [inaudible]; it's a great cartography textbook. The example here is that you have farmers' fields that have been divided up into different sized areas, and you want to make a choropleth map of how much of those fields has been harvested. So notice that we have 16 acres here, 16 acres here, and 64 acres there, and so we're measuring the areas in terms of acres.
If we look at the total acres harvested, let's say we're harvesting corn, here we have no acres harvested, so there was no corn harvested there; we have 16 acres harvested here and 64 acres harvested here. All it is, is that these have been divided up into different sized fields. If we make a choropleth map based on those raw counts, as opposed to accounting for area or normalizing, then this is the choropleth map that we would get. If we just interpreted this, we see a light green, which would mean that there was low or no corn harvested; we have a medium green, which would be a medium amount harvested; and this dark green would be a high amount harvested. That's the way that someone would interpret that choropleth map. But look at it in terms of area: 4 times 16 is 64, so if you actually look at the same size area, the same amount of corn was being harvested, but you've got two different colors here. It is misleading, almost lying to somebody, in that they'll look at these two different colors and say there was less corn harvested here and more corn harvested there based on these counts, which is not accurate.
It's misleading and it's not a good way of representing your data. However, if we divide by area, then you can see here that these are now the same color, because we've normalized for area; we've divided by total acres there. Now we've got something that's more representative. When somebody looks at that, they say, "Oh, these are the same color; that means there's the same amount harvested," and that's what we want them to see: something that's true and more representative of the data. If we apply exactly the same idea to the census tract data, let's see what happens there. What I've done here is isolate two census tracts that are quite different in size. If we look at the population counts, you'll see that this really big census tract has 12,909 people in it and this smaller census tract has 13,530 people in it. So the population counts are within five percent of each other, and so you think, okay, yes. If we look at that as a population choropleth, these would be the same color, because they're very similar values; they'd be the same color of red.
But if we look at the population densities, the density of the larger one is 167 people per square kilometer. The density of the smaller one is 3,232 people per square kilometer, so way higher density. The density of one is 19 times higher than the other. So which do you think makes more sense in terms of trying to compare things? I think you probably see that it usually makes more sense to normalize or standardize your data if there's some bias taking place, and with choropleth maps and geography, area is the most common thing that you'd want to account for. Just to summarize, we can look at total versus derived values. Total values are things like population counts, which are not normally used for choropleths unless you have a good reason to use them. That's not to say that it's absolutely forbidden or that the software won't let you; it's nothing like that. It's just that you have to be conscious of these things and make that decision intentionally. So you wouldn't normally use totals for things like population per census tract, as I've just shown you. What we do prefer to use for choropleths are derived values. Those are ratios involving area, like the normalizing we were just doing.
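The arithmetic behind the two worked examples, the corn fields and the pair of census tracts, can be checked with a short sketch. The numbers come from the lecture; which field had nothing harvested is assumed to be one of the 16-acre fields, and the tract areas are not given in the lecture, so the ones below are back-calculated from population and density and are only approximate.

```python
# Corn fields as (field size in acres, acres harvested).
# Assumption: the unharvested field is one of the two 16-acre fields.
fields = [(16, 0), (16, 16), (64, 64)]

raw_counts = [harvested for _, harvested in fields]
fraction_harvested = [harvested / acres for acres, harvested in fields]
# Raw counts give three different shades; the fractions show that the
# second and third fields were both fully harvested.

# The two census tracts, using the counts and densities quoted in the lecture
large_tract = {"population": 12909, "density_per_km2": 167}
small_tract = {"population": 13530, "density_per_km2": 3232}

# Relative difference in the raw counts (just under five percent)
count_gap = (small_tract["population"] - large_tract["population"]) / large_tract["population"]

# How many times denser the small tract is (roughly 19 times)
density_ratio = small_tract["density_per_km2"] / large_tract["density_per_km2"]

# Areas implied by population / density (back-calculated, not from the lecture)
implied_area_km2 = {
    "large": large_tract["population"] / large_tract["density_per_km2"],
    "small": small_tract["population"] / small_tract["density_per_km2"],
}
```

Two tracts that would share a color on a count map differ by a factor of about 19 once area is divided out, which is the same effect as the fields that look different on raw acres but identical on fraction harvested.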
So, that would be population density per census tract. Or they can be ratios that are independent of area, things like per capita income for census tracts. Those are things that are not biased by area, and so that's perfectly fine. As an example of that, I just made this map for fun. It's a ratio of males to females for different parts of the city. These are different neighborhoods, and I just put the labels on here, whether you're familiar with Toronto or not. I've used a diverging color scheme, so where the ratio is one or very close to one, in other words there are equal numbers of males and females, we've got this gray, and then the diverging color scheme shows ratios increasingly higher or lower than one. So, as you can see here, if you're looking for a lady, downtown is the place to be; if you're looking for a man, then I'd say get to West Humber-Clairville or Wexford/Maryvale. I'm just joking around here, but the idea is that it's actually interesting to see that there is variation in the ratio over the city.
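A diverging scheme like the one on this map can be sketched as a classification centered on a ratio of one. Everything below is hypothetical: the neighborhood labels, the counts, the width of the neutral band, and the class steps were picked for illustration and are not the settings used for the map in the lecture.

```python
def diverging_class(ratio, neutral_band=0.02, step=0.05, max_steps=2):
    """Return 0 for ratios close to 1, a negative class for ratios below 1
    and a positive class for ratios above 1; larger magnitude means
    further from 1, capped at max_steps."""
    deviation = ratio - 1.0
    if abs(deviation) <= neutral_band:
        return 0
    steps = 1 + int((abs(deviation) - neutral_band) // step)
    steps = min(steps, max_steps)
    return steps if deviation > 0 else -steps


# Hypothetical neighborhoods as (males, females)
neighborhoods = {"N1": (4800, 5200), "N2": (5000, 5000), "N3": (5300, 5000)}

ratios = {name: males / females for name, (males, females) in neighborhoods.items()}
classes = {name: diverging_class(r) for name, r in ratios.items()}

# Gray in the middle, two hues diverging on either side (illustrative names)
palette = {-2: "dark purple", -1: "light purple", 0: "gray",
           1: "light green", 2: "dark green"}
shades = {name: palette[c] for name, c in classes.items()}

# Scaling both counts by the same factor leaves the ratio unchanged,
# which is why the size of the unit does not bias this kind of variable.
doubled_ratio = (2 * 4800) / (2 * 5200)
```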
I have no idea why that is, but this is a way of showing data that's not biased by area, because all we're doing is dividing one by the other. It's a ratio of one thing to another, and the size of the census tract will have no effect on how many men versus women there would be in a particular location. So, appropriate data for choropleths are things like statistical or political boundaries, where people have drawn these boundaries to count things like people. This might be population per census division, something like that. Choropleths are usually not used for continuous data. The reason for this is that the distribution of the data is not related to the boundaries that are being used to count things. So, for example, you could do this, but it really wouldn't make much sense to make a map of average rainfall per census division, because there's no connection or relationship between those two variables. Census divisions were designed to count people for a census. That has nothing to do with rainfall, and the amount of rain that falls at a location has nothing to do with these boundaries.
So, yes, if you had boundaries like watersheds or something like that, that would make sense and you could do a choropleth for that, but again, that's an exception; it's not the most common way of doing things. Let's take this to an extreme just to see how this works. I've got a map here of Toronto neighborhoods, and here we have elevation data, which is an example of continuous data. What if I made a choropleth map of elevation per neighbourhood? Yes, I can do that, it's certainly possible, but is this meaningful? Is this really useful? Not really. I had a little fun with this: I turned it into an extruded prism map, or 3D map, just to show you conceptually and visually how this is actually working. If we took these elevation values for each neighborhood and extruded them, this is really what the choropleth is implying: that you have these areas that are perfectly flat in terms of elevation, and then as you get right to the edge of one of these boundaries you fall off a cliff, and then you get to the next perfectly level area here, and then you'd have to climb up this cliff to get to this next perfectly level area, and so on.
So, of course, that's ridiculous; that's not the way elevation works. So why would you use these boundaries to represent a variable that's not related to those boundaries? It's different for people. If I show it this way for people, that's fine, because these boundaries were designed for people. All it's really doing is saying there's this many people in this area and this many people in that area. There's nothing bad or unrelated about that; it makes total sense. So make a choropleth for that, but don't make it for something that's not related to the boundaries that are being used. This is a better way of showing that data: this is continuous data with the neighborhoods draped on top of it. This is quite exaggerated; Toronto is not nearly as dramatic in terms of the terrain, but you get the idea that you've got a lot of variation in any particular neighborhood, and it's not really going to be representative to show that as one flat choropleth unit.
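What gets lost when a continuous surface is collapsed into one value per area can be shown with a tiny grid. This is a sketch under invented data: the 4 by 4 elevation grid and the four zones stand in for a terrain model and neighborhood boundaries, and the per-zone mean plays the role of the single choropleth value.

```python
# Hypothetical elevation grid in meters (values invented for illustration)
elevation = [
    [100, 102, 130, 135],
    [101, 104, 140, 150],
    [ 90,  92, 160, 170],
    [ 88,  91, 165, 180],
]

# Hypothetical zone label for each cell, standing in for neighborhoods
zones = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["C", "C", "D", "D"],
    ["C", "C", "D", "D"],
]

# Collect the elevation cells that fall inside each zone
cells = {}
for elevation_row, zone_row in zip(elevation, zones):
    for value, zone in zip(elevation_row, zone_row):
        cells.setdefault(zone, []).append(value)

# The one number a choropleth would show per zone
zonal_mean = {zone: sum(values) / len(values) for zone, values in cells.items()}

# The variation inside each zone that the single flat value hides
zonal_range = {zone: max(values) - min(values) for zone, values in cells.items()}
```

Zones B and D each span 20 meters of relief that a single shade cannot show, and the step between the mean of A and the mean of B is the "cliff" at the boundary that the extruded map makes visible.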
