All language subtitles for 16. Counting Sort (Theory)

af Afrikaans
sq Albanian
am Amharic
ar Arabic Download
hy Armenian
az Azerbaijani
eu Basque
be Belarusian
bn Bengali
bs Bosnian
bg Bulgarian
ca Catalan
ceb Cebuano
ny Chichewa
zh-CN Chinese (Simplified)
zh-TW Chinese (Traditional)
co Corsican
hr Croatian
cs Czech
da Danish
nl Dutch
en English
eo Esperanto
et Estonian
tl Filipino
fi Finnish
fr French
fy Frisian
gl Galician
ka Georgian
de German
el Greek
gu Gujarati
ht Haitian Creole
ha Hausa
haw Hawaiian
iw Hebrew
hi Hindi
hmn Hmong
hu Hungarian
is Icelandic
ig Igbo
id Indonesian
ga Irish
it Italian
ja Japanese
jw Javanese
kn Kannada
kk Kazakh
km Khmer
ko Korean
ku Kurdish (Kurmanji)
ky Kyrgyz
lo Lao
la Latin
lv Latvian
lt Lithuanian
lb Luxembourgish
mk Macedonian
mg Malagasy
ms Malay
ml Malayalam
mt Maltese
mi Maori
mr Marathi
mn Mongolian
my Myanmar (Burmese)
ne Nepali
no Norwegian
ps Pashto
fa Persian
pl Polish
pt Portuguese
pa Punjabi
ro Romanian
ru Russian
sm Samoan
gd Scots Gaelic
sr Serbian
st Sesotho
sn Shona
sd Sindhi
si Sinhala
sk Slovak
sl Slovenian
so Somali
es Spanish
su Sundanese
sw Swahili
sv Swedish
tg Tajik
ta Tamil
te Telugu
th Thai
tr Turkish
uk Ukrainian
ur Urdu
uz Uzbek
vi Vietnamese
cy Welsh
xh Xhosa
yi Yiddish
yo Yoruba
zu Zulu
or Odia (Oriya)
rw Kinyarwanda
tk Turkmen
tt Tatar
ug Uyghur
Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated: 1 1 00:00:05,250 --> 00:00:07,890 The algorithms we've looked at so far 2 2 00:00:07,890 --> 00:00:11,000 don't make any assumptions about the data they're sorting. 3 3 00:00:11,000 --> 00:00:15,170 We can sort integers, floats, strings, objects anything. 4 4 00:00:15,170 --> 00:00:18,380 The algorithms don't assume a certain data type. 5 5 00:00:18,380 --> 00:00:20,460 Specific implementations do 6 6 00:00:20,460 --> 00:00:22,700 but not the algorithms themselves. 7 7 00:00:22,700 --> 00:00:25,240 They also don't assume that the data being sorted 8 8 00:00:25,240 --> 00:00:27,290 is bound in any way. 9 9 00:00:27,290 --> 00:00:29,830 For example, they don't assume that all the values 10 10 00:00:29,830 --> 00:00:32,340 being sorted will be less than 100. 11 11 00:00:32,340 --> 00:00:35,100 There are algorithms that do make assumptions 12 12 00:00:35,100 --> 00:00:36,410 about the data being sorted 13 13 00:00:36,410 --> 00:00:38,190 and because they make assumptions 14 14 00:00:38,190 --> 00:00:40,730 they can sort the data more efficiently. 15 15 00:00:40,730 --> 00:00:45,730 In fact they can achieve linear time, O to the N complexity 16 16 00:00:46,170 --> 00:00:50,010 and so the next couple of algorithms we're going to look at 17 17 00:00:50,010 --> 00:00:52,980 make assumptions about the data they're sorting 18 18 00:00:52,980 --> 00:00:56,840 and the first algorithm is called counting sort. 19 19 00:00:56,840 --> 00:00:59,040 So counting sort makes assumptions 20 20 00:00:59,040 --> 00:01:00,890 about the data that it's sorting, 21 21 00:01:00,890 --> 00:01:04,550 it assumes that all the values are discrete 22 22 00:01:04,550 --> 00:01:07,530 and they're within a specific range. 23 23 00:01:07,530 --> 00:01:09,220 So this algorithm only works 24 24 00:01:09,220 --> 00:01:11,250 with non-negative discrete values. 25 25 00:01:11,250 --> 00:01:15,390 We can't sort floating point numbers, we can't sort strings 26 26 00:01:15,390 --> 00:01:18,980 because they don't have discrete values. 27 27 00:01:18,980 --> 00:01:21,550 So practically speaking, you'll usually use 28 28 00:01:21,550 --> 00:01:23,730 this algorithm with whole numbers. 29 29 00:01:23,730 --> 00:01:26,700 So the way it works is it doesn't use comparisons, 30 30 00:01:26,700 --> 00:01:29,090 so it doesn't actually compare values 31 31 00:01:29,090 --> 00:01:30,860 in the array against each other 32 32 00:01:30,860 --> 00:01:35,660 instead it counts the number of occurrences of each value 33 33 00:01:35,660 --> 00:01:37,940 and you'll see how it does this 34 34 00:01:37,940 --> 00:01:40,480 when we go through the algorithm 35 35 00:01:40,480 --> 00:01:43,290 for a specific array in just a minute. 36 36 00:01:43,290 --> 00:01:47,110 Now as I said, values must be within a specific range 37 37 00:01:47,110 --> 00:01:49,130 and the range has to be reasonable. 38 38 00:01:49,130 --> 00:01:50,970 It can't be huge. 39 39 00:01:50,970 --> 00:01:52,830 You're not going to use counting sort 40 40 00:01:52,830 --> 00:01:55,560 to sort values that are between 41 41 00:01:55,560 --> 00:01:57,930 one and a million, for example. 42 42 00:01:57,930 --> 00:02:01,050 Now our array, the usual array that we've been using, 43 43 00:02:01,050 --> 00:02:03,620 is not a good candidate for counting sort. 44 44 00:02:03,620 --> 00:02:06,400 It has negative numbers, so we're going to use a different 45 45 00:02:06,400 --> 00:02:10,260 array to go through this algorithm by hand 46 46 00:02:10,260 --> 00:02:11,940 and so this is the array we're going to use. 47 47 00:02:11,940 --> 00:02:15,270 We're going to assume that we're sorting values 48 48 00:02:15,270 --> 00:02:17,900 that are between one and 10 inclusive, 49 49 00:02:17,900 --> 00:02:20,900 so the values can be one, two, three, four, 50 50 00:02:20,900 --> 00:02:23,340 five, six, seven, eight, nine or 10. 51 51 00:02:23,340 --> 00:02:26,900 We have 10 possible values and so what we do 52 52 00:02:26,900 --> 00:02:30,150 is we create a counting array of length 10. 53 53 00:02:30,150 --> 00:02:32,160 So this is going to be a separate array 54 54 00:02:32,160 --> 00:02:34,310 from the one that we're sorting 55 55 00:02:34,310 --> 00:02:36,350 and then we traverse the input array, 56 56 00:02:36,350 --> 00:02:38,700 the one we're sorting, from left to right 57 57 00:02:38,700 --> 00:02:41,590 and using the counting array we track how many 58 58 00:02:41,590 --> 00:02:43,750 of each value are in the input array 59 59 00:02:43,750 --> 00:02:47,010 and then once we've traversed the entire input array, 60 60 00:02:47,010 --> 00:02:49,590 using the counts in the counting array, 61 61 00:02:49,590 --> 00:02:53,790 we write the values in sorted order back to the input array. 62 62 00:02:53,790 --> 00:02:55,780 And so we're going to sort the array 63 63 00:02:55,780 --> 00:02:57,350 that you see on the screen 64 64 00:02:57,350 --> 00:03:00,220 and so when i = 0, we're going to look 65 65 00:03:00,220 --> 00:03:04,010 at the value at position zero and we see it's two 66 66 00:03:04,010 --> 00:03:08,490 and so we add one to the counting array at position one 67 67 00:03:08,490 --> 00:03:11,290 because we know the values are between one and 10, 68 68 00:03:11,290 --> 00:03:13,750 in the counting array position zero 69 69 00:03:13,750 --> 00:03:16,520 is going to hold the number of one's we find. 70 70 00:03:16,520 --> 00:03:19,710 Position one will hold the number of two's we find. 71 71 00:03:19,710 --> 00:03:23,800 Position two will hold the number of three's we find etc 72 72 00:03:23,800 --> 00:03:26,380 and so in the first position we found a two 73 73 00:03:26,380 --> 00:03:30,550 so we increment counting array index one, 74 74 00:03:30,550 --> 00:03:31,930 we increment it by one. 75 75 00:03:31,930 --> 00:03:34,020 Then we're going to increment i to one 76 76 00:03:34,020 --> 00:03:38,160 and at the input array position one we have a five 77 77 00:03:38,160 --> 00:03:41,580 so we're going to increment counting array four 78 78 00:03:41,580 --> 00:03:44,640 because that's where we're counting the number of fives. 79 79 00:03:44,640 --> 00:03:47,960 We increment i to two, we have a nine, 80 80 00:03:47,960 --> 00:03:50,140 so we're going to increment counting array 81 81 00:03:50,140 --> 00:03:54,000 position eight by one because we found one nine. 82 82 00:03:54,000 --> 00:03:55,820 When i is three we have an eight, 83 83 00:03:55,820 --> 00:03:59,390 so we're going to increment counting array seven by one. 84 84 00:03:59,390 --> 00:04:01,980 At position four we found another two 85 85 00:04:01,980 --> 00:04:05,430 and so we're going to increment counting array one by one, 86 86 00:04:05,430 --> 00:04:06,960 so now we have two there because 87 87 00:04:06,960 --> 00:04:09,860 so far we've seen two two's in our array. 88 88 00:04:09,860 --> 00:04:13,110 Next we find another eight, so counting array seven 89 89 00:04:13,110 --> 00:04:15,700 will be incremented by one so we now have a two there 90 90 00:04:15,700 --> 00:04:17,550 because we've seen two eight's. 91 91 00:04:17,550 --> 00:04:19,270 Then we find a seven so we're going to 92 92 00:04:19,270 --> 00:04:21,940 increment counting array six. 93 93 00:04:21,940 --> 00:04:23,390 We found a 10 so we're going to 94 94 00:04:23,390 --> 00:04:25,660 increment counting array nine. 95 95 00:04:25,660 --> 00:04:27,210 We found a four, so we going to 96 96 00:04:27,210 --> 00:04:28,820 increment counting array three 97 97 00:04:29,680 --> 00:04:31,600 and finally we find a three 98 98 00:04:31,600 --> 00:04:34,800 and so we're going to increment counting array two by one. 99 99 00:04:34,800 --> 00:04:38,140 And so at this point we've traversed our entire input array 100 100 00:04:38,140 --> 00:04:40,250 and we have counted how many occurrences 101 101 00:04:40,250 --> 00:04:41,730 of each value we've seen 102 102 00:04:41,730 --> 00:04:46,090 and so what we want to do now is using the counting array, 103 103 00:04:46,090 --> 00:04:49,920 we're going to write back the values to the original array. 104 104 00:04:49,920 --> 00:04:53,430 So in this slide the counting array is at the top 105 105 00:04:53,430 --> 00:04:56,220 and the input array is at the bottom, 106 106 00:04:56,220 --> 00:04:57,310 after we've written it back. 107 107 00:04:57,310 --> 00:04:58,500 So the way this would work is, 108 108 00:04:58,500 --> 00:05:00,800 we'd traverse the counting array 109 109 00:05:00,800 --> 00:05:04,500 and we would say okay so there is zero one's 110 110 00:05:04,500 --> 00:05:06,200 and so we don't have to do anything. 111 111 00:05:06,200 --> 00:05:08,870 We just move to the next position in the counting array. 112 112 00:05:08,870 --> 00:05:10,510 Okay there's two two's 113 113 00:05:10,510 --> 00:05:12,330 and so at this point we're gonna write 114 114 00:05:12,330 --> 00:05:14,890 two two's into the input array 115 115 00:05:14,890 --> 00:05:16,220 and then there's one three 116 116 00:05:16,220 --> 00:05:19,100 so we'll write one three to the input array. 117 117 00:05:19,100 --> 00:05:23,260 There's one four so we'll write one four to the input array. 118 118 00:05:23,260 --> 00:05:27,330 There's one five so we'll write one five to the input array. 119 119 00:05:27,330 --> 00:05:30,360 There are zero sixes, so we don't do anything. 120 120 00:05:30,360 --> 00:05:32,550 There is one seven, so we write 121 121 00:05:32,550 --> 00:05:34,740 one seven to the input array. 122 122 00:05:34,740 --> 00:05:36,900 There are two eights, so we're gonna write 123 123 00:05:36,900 --> 00:05:39,070 two eight's to the input array. 124 124 00:05:39,070 --> 00:05:42,940 There's one nine, so we write one nine to the input array 125 125 00:05:42,940 --> 00:05:44,600 and finally there's one 10 126 126 00:05:44,600 --> 00:05:47,190 and so we write one 10 to the input array. 127 127 00:05:47,190 --> 00:05:49,800 So essentially counting sort has two phases. 128 128 00:05:49,800 --> 00:05:51,987 The first phase is we traverse the input array 129 129 00:05:51,987 --> 00:05:54,980 and we count how many of each value we have 130 130 00:05:54,980 --> 00:05:58,520 and then in the second phase using the counting array, 131 131 00:05:58,520 --> 00:06:01,380 we write values back into the input array 132 132 00:06:01,380 --> 00:06:04,920 and so as you can see our input array is now sorted 133 133 00:06:04,920 --> 00:06:06,790 and we didn't do any comparisons. 134 134 00:06:06,790 --> 00:06:10,100 We didn't compare any of the elements against each other. 135 135 00:06:10,100 --> 00:06:13,530 We're just merely counting, how many of each value we have 136 136 00:06:13,530 --> 00:06:16,480 and that's why the algorithm is called counting sort. 137 137 00:06:16,480 --> 00:06:19,070 Now this is not an in-place algorithm. 138 138 00:06:19,070 --> 00:06:21,850 As we've seen we have to create a counting array 139 139 00:06:21,850 --> 00:06:23,830 and the length of the counting array 140 140 00:06:23,830 --> 00:06:28,020 will depend on the range of values we have, 141 141 00:06:28,020 --> 00:06:31,770 and because of this counting sort is best used, 142 142 00:06:31,770 --> 00:06:34,250 when the range of values you have 143 143 00:06:34,250 --> 00:06:38,240 is around the same length of the input array. 144 144 00:06:38,240 --> 00:06:41,700 So if you have an input array of 20, 145 145 00:06:41,700 --> 00:06:43,930 ideally the number of values that there are, 146 146 00:06:43,930 --> 00:06:45,680 you want to be that to be around 20 147 147 00:06:45,680 --> 00:06:49,460 because think about if you have an array of 10 elements 148 148 00:06:49,460 --> 00:06:52,750 and the range of values can go from one to 10,000, 149 149 00:06:52,750 --> 00:06:55,100 then your going to have to create a counting array 150 150 00:06:55,100 --> 00:06:58,140 of length 10,000 to sort 10 elements 151 151 00:06:58,140 --> 00:07:00,610 which would obviously be ridiculous. 152 152 00:07:00,610 --> 00:07:03,630 So counting sort works best when 153 153 00:07:03,630 --> 00:07:06,530 the range of values that you can have, 154 154 00:07:06,530 --> 00:07:08,380 the number of values in that range 155 155 00:07:08,380 --> 00:07:10,670 is roughly equivalent to the number of values 156 156 00:07:10,670 --> 00:07:12,780 that you want to sort. 157 157 00:07:12,780 --> 00:07:15,750 So in our case we had a range of 10 values 158 158 00:07:15,750 --> 00:07:18,800 and we had an array of 10 so it was perfect. 159 159 00:07:18,800 --> 00:07:22,610 As I mentioned this algorithm can achieve linear time 160 160 00:07:22,610 --> 00:07:24,730 and it can do that because it's making 161 161 00:07:24,730 --> 00:07:26,340 assumptions about the data 162 162 00:07:26,340 --> 00:07:28,590 and so because it can make these assumptions, 163 163 00:07:28,590 --> 00:07:31,220 it can achieve linear time. 164 164 00:07:31,220 --> 00:07:34,330 Now the counting sort I showed you is unstable. 165 165 00:07:34,330 --> 00:07:37,690 If we want it to be stable there are extra steps 166 166 00:07:37,690 --> 00:07:39,790 we can take with counting sort 167 167 00:07:39,790 --> 00:07:42,990 instead of just incrementing a counter for example 168 168 00:07:42,990 --> 00:07:45,550 you would perhaps use what's called linked list 169 169 00:07:45,550 --> 00:07:48,530 in the counting array, we're gonna cover linked list later. 170 170 00:07:48,530 --> 00:07:51,050 Okay that's it for learning what 171 171 00:07:51,050 --> 00:07:52,990 the counting sort algorithm is. 172 172 00:07:52,990 --> 00:07:54,380 Let's go ahead and implement it. 173 173 00:07:54,380 --> 00:07:55,930 I'll see you in the next video. 15419

Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.